describe the solar flare data base and outline general principles for
effective data management. Three statistical techniques for solar flare
probability forecasting are discussed in Section 3, viz., discriminant
analysis, logistic regression, and multiple linear regression. We also review
two scoring measures and suggest the logistic regression approach for
obtaining 24-hour forecasts. In Section 4 a heuristic procedure is used to
select nine basic predictors from the many available explanatory variables.
Using these nine variables, logistic regression is demonstrated by example in
Section 5. We conclude in Section 6 with broad suggestions regarding
continued development of objective methods for solar flare probability
forecasting.

1.  INTRODUCTION


Historically,  solar  flare  forecasting  methods  have  been  subjectively 
formulated,  relying  heavily  on  forecaster  insight.  This  report  addresses  the 
desire  for  an  objective  technique  for  solar  flare  probability  forecasting,  in 
light  of  the  importance  of  accurate  forecasts  to  the  scientific  community  and 
the  general  public. 

The  Space  Environment  Services  Center  (SESC),  a  part  of  the  NOAA  Space 
Environment  Laboratory  in  Boulder,  Colorado,  provides  24-hour  probability 
forecasts  of  regional  solar  flare  disturbances.  Variables  comprising  predic¬ 
tive  information  for  this  subjective  method  are  those  found  or  conjectured  to 
be  useful  by  SESC  forecasters  (Hirman  and  Flowers  (1979)).  The  "region 
analysis"  variables  thought  essential  to  flare  occurrence  serve,  as  well,  for 
our  development  of  an  objective  technique.  For  a  complete  discussion  of  goals 
and  services  of  the  SESC,  the  reader  is  referred  to  Heckman  (1979b)  and  Mangis 
(1975). 

Solar  flare  forecasts  made  by  the  SESC  predict  both  the  occurrence  and 
magnitude  of  flares.  Four  classes,  denoting  the  largest  event  in  a  24-hour 
period,  can  be  identified:  (1)  no  flare,  (2)  class  C  flare,  (3)  class  M  flare,
and  (4)  class  X  flare.  Ranges  of  X-ray  yield  defining  flare  classes  are 
listed  in  Table  1  (see  Mangis  (1975)). 


Table 1. Flare Classification by X-Ray Yield


Class    Energy Output E in the 1-8 Å Spectral Range

C        $10^{-6} \le E < 10^{-5}$ W/m$^2$

M        $10^{-5} \le E < 10^{-4}$ W/m$^2$

X        $10^{-4} \le E$ W/m$^2$

An additional coefficient appended to the letter designator indicates the
relative intensity within the appropriate energy range. For example, an X-ray
class M3 flare would yield $3 \times 10^{-5}$ W/m$^2$, an X5 would yield
$5 \times 10^{-4}$ W/m$^2$, and so on. A non-energetic flare is one which is
less than class C1, that is, one which produces less than $1 \times 10^{-6}$
W/m$^2$. It is the moderate class M flare and the major class X flare which
are of greatest consequence to the near-earth environment.
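
The classification in Table 1 can be illustrated with a short sketch. The function names `flare_class` and `flare_code` and the one-decimal label format are assumptions made for illustration, not SESC conventions; only the thresholds come from Table 1.

```python
def flare_class(peak_flux):
    """Label a peak 1-8 angstrom X-ray flux (W/m^2) using the Table 1 ranges.

    Returns "none" below class C1; otherwise the class letter followed by
    the coefficient, e.g. "M3.0". The label format is an illustrative choice.
    """
    # Thresholds from Table 1, checked from the most to the least energetic class.
    for letter, lower_bound in (("X", 1e-4), ("M", 1e-5), ("C", 1e-6)):
        if peak_flux >= lower_bound:
            return f"{letter}{peak_flux / lower_bound:.1f}"
    return "none"


def flare_code(peak_flux):
    """Map a flux to a code 0, 1, 2, 3 (no flare, class C, class M, class X)."""
    label = flare_class(peak_flux)
    return {"n": 0, "C": 1, "M": 2, "X": 3}[label[0]]
```

For instance, `flare_class(3e-5)` gives `"M3.0"`, matching the M3 example in the text, and any flux below 1e-6 W/m^2 is labeled `"none"`.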


If we let Z represent a random variable such that Z = 0, 1, 2, or 3
corresponding to the largest flare which occurs in the next 24 hours in a
selected region (no flare, class C, class M, or class X, respectively), then
estimates are provided by the SESC for the following conditional
probabilities:


(i)    $\Pr[Z = 0 \mid \mathbf{x}]$

(ii)   $\Pr[Z \ge 1 \mid \mathbf{x}]$

(iii)  $\Pr[Z \ge 2 \mid \mathbf{x}]$

(iv)   $\Pr[Z = 3 \mid \mathbf{x}]$ ,


where $\mathbf{x}$ denotes an observed vector of prediction variables
associated with the selected region.^1

Objective  prediction  of  probabilities  (i)-(iv),  or  variants  of  these,  has 
been  accomplished  with  some  success  by  Hirman,  et  al.  (1980)  using  the
technique  of  multivariate  discriminant  analysis  and  by  Vecchia,  et  al.  (1980)
using  logistic  regression.  These  results  demonstrate  potential  improvement  on
subjective  forecasts  and  indicate  that,  perhaps,  the  time  has  arrived  for  an
expanded  effort  to  develop  an  objective  technique.  This  would  enable 
forecasters  to  attach  quantitative  significance  to  the  many  interrelated 
variables  comprising  the  inputs  to  any  forecasting  method. 

In  this  report  we  propose  the  technique  of  logistic  regression  for 
prediction  of  (i)-(iv).  Though  we  undertake  to  examine  only  24-hour 
forecasts,  with  continued  development  and  understanding  the  procedure  can  be 
applied  to  other  time  frames.  It  should  be  emphasized  that  a  complete 
evaluation  of  any  proposed  technique  can  result  only  from  comparison  to  the 
baseline  measure  provided  by  the  subjective  forecasting  system.  For  the 
example  we  provide  scores  for  intercomparison  of  logistic  regression, 
discriminant  analysis,  and  the  SESC  forecasts,  based  on  measures  already  used 
by  the  SESC. 

In  section  2  we  describe  the  current  solar  flare  data  base  and  outline  a 
general  data  management  procedure  which  is  essential  if  present  and  future 
records  are  to  provide  the  statistical  information  of  interest  and  importance. 
Section  3  is  a  brief  review  of  three  techniques  which  have  been  suggested  to 
forecast  solar  flares  — discriminant  analysis,  logistic  regression,  and 
multiple  linear  regression.  We  also  review  two  scoring  measures  and  discuss 
our  preference  for  the  logistic  regression  approach.  A  heuristic  procedure 
for  selection  and  transformation  of  variables  is  employed  in  section  4  to 
obtain  nine  basic  variables  for  the  examples  in  section  5.  We  conclude  in
section  6  with  broad  suggestions  regarding  continued  development  of  objective 
methods  for  solar  flare  probability  forecasting. 


^1 The notation "$Z = 0 \mid \mathbf{x}$" is read "Z = 0 given $\mathbf{x}$."
For example, (iii) is the probability that an M or X flare will occur in the
next 24 hours given the predictors $\mathbf{x}$.


2.  REGION  ANALYSIS  DATA 


2.1  Description  of  Data 

The  data  result  from  a  data  collection  and  analysis  scheme  initiated  by 
the  SESC  on  January  1,  1977.  These  observations,  collected  from  SESC  sensors 
and  from  cooperating  agencies  and  institutions,  reflect  the  complexity, 
magnetic  configuration,  age,  location,  and  past  history  of  active  solar 
regions.  All  explanatory  variables  included  in  the  SESC  record  are  available 
in  near  real  time  though  some  are  not  accessible  on  a  daily  basis.  Any  use  of
the  SESC  data  base  to  develop  solar  flare  forecasting  techniques  should 
acknowledge  this  limitation. 

Data  collected  by  the  SESC  are  divided  into  four  categories:  white 
light,  H-alpha,  radio,  and  region  history.  In  addition,  the  variables  are 
mixed — continuous  and  discrete — and  some  are  dichotomous.  A  complete  list  of 
48  variables  and  brief  descriptions  of  each  is  provided  in  Appendix  A.  Some 
variables  are  recoded  and/or  reordered  versions  of  the  original  SESC 
observations.  For  the  most  part,  this  recoding  was  based  on  a  reassessment  by 
staff  forecasters  of  the  relation  of  the  variables  to  solar  flare  activity. 

An  abridged  list  of  available  information  is  presented  in  Table  2. 
Variables  listed  are  those  considered  in  the  current  study  and  do  not  include 
information  on  the  location  of  active  regions  on  the  sun.  It  is  indicated  if  a 
variable  is  continuous  or  discrete  and,  if  discrete,  the  number  of  distinct 
levels  assumed.  Also  noted  is  the  source  of  each  variable.  Additional 
information  in  Table  2  reflects  the  fact  that  variables  range  from  completely 
objective  measurements  to  highly  subjective  forecaster  observations,  such  as 
visual  evaluation  of  optical  telescope  photographs.  This  range  of  objectivity 
has  been  coded  into  three  categories  by  SESC  forecasters. 

The  current  study  utilizes  6097  region-day  records  collected  from  January 
1,  1977  to  January  31,  1979.  These  records  were  reduced  to  4487  records  by  the
elimination  of  records  indicating  the  absence  of  sunspots,  since  such  regions 
rarely  produce  flares.  For  these  cases,  two-way  crosstabulations  of  FLARER 
(Variable  39)  with  other  variables  are  given  in  Appendix  C. 


Table 2. Region Analysis Variables


Number^1  Name      Type^2  Source^3  Description^4

 1        DATE      D       SESC      Year, month, day (3)
 8        AGE       D-15    SESC      Age of region (3)
10        MAGCLAS   D-7     SESC      Magnetic class (2)
11        RV        D-3     MW        Magnetic field strength polarity (3)
12        MAGSTR    D-98    MW        Magnetic field strength (3)
13        MAGGRAD   C       MW        Magnetic gradient in gamma/km (3)
14        SSDYNAM   D-4     SOON      Sunspot dynamics (1)
15        SSINTER   D-2     SESC      Interaction with another region (1)
16        STGDEV    D-6     SESC      Stage of development (2)
19        SECTEOW   D-8     SESC      Relationship with nearest sector boundary (3)
20        PLAGFIL   D-6     SOON      Plage compactness and embedded filament (1)
21        NEUTLOR   D-5     SOON      Main neutral line orientation within plage (1)
22        REVPOL    D-2     MW        Orientation within plage (3)
23        NEUTLCOM  D-5     SOON      Neutral line complexity (1)
24        NEUTLCHG  D-3     SOON      Neutral line temporal changes (1)
25        ASSOCFIL  D-5     SOON      Associated filament (2)
26        BRTPTS    D-3     SOON      Bright points (3)
27        PLAGFLUX  D-2     SOON      Plage fluctuations (3)
28        ISOPOLE   D-2     SOON      Isolated pole (2)
29        EFR       D-3     SOON      Emerging flux (2)
30        AFS       D-2     SOON      AFS (3)
33        FIRSTAPP  D-7     SESC      Region's first appearance (3)
35        CRFOR     C       SESC      C flare forecast for region
36        MRFOR     C       SESC      M flare forecast for region
37        XRFOR     C       SESC      X flare forecast for region
39        FLARER    D-4     SESC      Largest flare in next 24 hours
41        FLUX      C       SESC      10 cm flux (3)
46        FLARERT   D-4     SESC      Largest flare today (3)
47        RECSPOT   D-8     SESC      Recoded sunspot class (2)

^1 Variable numbers correspond to those of the complete data list in Appendix A.
^2 Variable type codes are: C = continuous; D-n = discrete, with n the number
   of levels.
^3 Sources are: Space Environment Services Center (SESC); Solar observing
   optical network (SOON); Mt. Wilson (MW), Boulder, CO.
^4 Description parenthetic codes denote level of objectivity for the
   measurement: 1 = least objective, 2 = moderately objective, 3 = most
   objective.


2.2  Recommendations  for  Data  Management 

From  the  outset  of  this  study  it  was  apparent  that  data  access  and 
retrieval  difficulties  would  arise,  though  the  actual  extent  of  problems  was 
not  anticipated.  Through  September  30,  1979,  four  blocks  of  data  representing 
8001  SESC  region-day  cases  from  January  1,  1977,  through  June  30,  1979,  have
been  transferred  for  analysis  to  the  NOAA  CDC  6600  computer.  Following 
substantial  delays  for  coding  and  keypunching  blocks  1-3  (Jan.  77-Dec.  77; 
Jan.  78-June  78;  July  78-Jan.  79),  it  was  recommended  that  the  SESC  develop 
capability  for  the  direct  transfer  of  data  from  local  terminals  in  near  real 
time  — perhaps  on  a  monthly  basis.  The  necessary  software  was  tested  in  the 
transfer  of  the  fourth  data  block,  consisting  of  1904  data  cases  from  February 
1,  1979  through  June  10,  1979. 

Preliminary  examination  of  the  data  revealed  many  inconsistencies  and/or 
errors.  To  the  extent  possible,  errors  detected  in  blocks  1-3  were  corrected 
by  SESC  staff  members.  In  some  cases,  impossible  values  were  recoded  as 
missing,  resulting  in  a  loss  of  information.  Extensive  and  serious  problems 
with  data  set  4,  unless  they  are  resolved,  will  prohibit  any  analysis  which 
could  be  expected  to  provide  reliable  information. 

The  importance  of  data  collection  and  management  cannot  be  exaggerated. 
The  difficulties  encountered  in  the  course  of  the  present  study  will  preclude 
a  rigorous  and  completely  reliable  analysis  of  information  contained  in  the 
data  base.  Such  basic  problems,  if  not  satisfactorily  resolved,  may  result  in 
future  records  which  cannot  possibly  provide  the  statistical  information  of 
interest  or  importance.  To  accomplish  a  reliable,  near  real-time  data 
transfer  and  analysis  scheme  will  require  the  following  important  components: 


1.  A  reliable  on-line  procedure  for  coding,  recording,  and  local 
storage  of  region  analysis  data  cases. 

2.  An  error-free  software  system  for  the  (direct)  periodic
transfer  of  region  analysis  data  to  the  larger  computer  system
required  for  complex  statistical  analyses.  This  package
should  provide  a  complete  data  record,  including  variables
created  or  recoded  from  original  variables. 

3.  Local  (SESC)  magnetic  tape  storage  of  original  data  sets 
until  the  complete  verification  of  a  successful  transfer  to 
the  larger  computer  is  accomplished. 

4.  Magnetic  tape  storage  of  data  sets  on  the  larger  computer  at 
the  time  successful  transfer  is  verified.  This  should  be  in 
duplicate  if  possible,  because  of  the  importance  and  size  of 
the  data  base. 


5.  An  efficient  software  package  for  data  base  management  on  the 
large  computer.  The  software  should  allow  listing  and  editing 
of  records.  Programs  to  scan  data  for  detectable  inconsistencies
should  be  developed  and  used  regularly.  To  date,  the
SPSS  statistical  programs  package  (Nie,  et  al.  (1975))  has 
been  employed  for  data  management,  and  has  the  additional 
advantage  of  providing  descriptive  and  analytic  statistical 
procedures  useful  for  studying  the  solar  flare  data. 
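
A scan for detectable inconsistencies, as recommended in item 5, could start from a range check on the discrete variables. The sketch below is hypothetical: it assumes each record is a dictionary keyed by variable name, that missing values are stored as `None`, and that a D-n variable is coded 1 through n. The actual SESC coding is described in Appendix A and may differ.

```python
# Level counts taken from the "D-n" type codes in Table 2.
# The 1..n coding of each variable is an assumption for this sketch.
LEVELS = {"MAGCLAS": 7, "STGDEV": 6, "RECSPOT": 8}


def scan_record(record):
    """Return a list of human-readable problems found in one region-day record."""
    problems = []
    for name, n_levels in LEVELS.items():
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not 1 <= value <= n_levels:
            problems.append(f"{name}: value {value} outside 1..{n_levels}")
    return problems
```

Running such a check on every record before analysis would flag impossible values early, rather than after they have been transferred and recoded as missing.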


It  should  be  emphasized  that  attention  unnecessarily  devoted  to  matters 
of  data  management  and,  in  particular,  to  correction  of  recording  or  transfer 
errors,  will  postpone  reliable  statistical  analyses.  In  all,  it  would  be  fair 
to  estimate  that  a  major  portion  of  the  effort  to  date  has  been  expended  to 
identify  and  correct  avoidable  data  base  errors.  Many  errors,  to  be  expected 
in  the  tedious  procedure  of  recording  and  keypunching  vast  amounts  of  data, 
can  be  eliminated  if  sufficient  resources  are  devoted  to  accomplish  the  above 
system  of  recording  and  management. 


3.  SOLAR  FLARE  FORECASTING  METHODS 


Solar  flare  forecasts  made  by  the  SESC  predict  both  the  occurrence  and 
magnitude  of  flares.  Four  classes,  denoting  the  largest  event  in  a  24-hour 
period  can  be  identified: 

(1)  No  Flare 

(2)  C  Class  Flare 

(3)  M  Class  Flare 

(4)  X  Class  Flare 

It  is  the  moderate  class  M  flare  and  the  major  class  X  flare  which  are  of 
greatest  consequence  to  near-earth  environmental  disciplines. 

If we let Z represent a random variable such that Z = 0, 1, 2, or 3
corresponding to the largest event (i.e., flare) which occurs in the next 24
hours in a selected region, then estimates are provided by the SESC for the
following conditional probabilities:

(i)    $\Pr[Z = 0 \mid \mathbf{x}]$

(ii)   $\Pr[Z \ge 1 \mid \mathbf{x}]$                              (3.0.1)

(iii)  $\Pr[Z \ge 2 \mid \mathbf{x}]$

(iv)   $\Pr[Z = 3 \mid \mathbf{x}]$

where $\mathbf{x}$ denotes an observed vector of prediction variables
associated with the selected region.
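
Probabilities (ii) and (iii) are cumulative over flare classes, so all four quantities in (3.0.1) follow from the four class probabilities for Z = 0, 1, 2, 3. A minimal sketch, in which the function name and the dictionary keys are illustrative assumptions:

```python
def sesc_probabilities(class_probs):
    """Convert [P(Z=0), P(Z=1), P(Z=2), P(Z=3)] into quantities (i)-(iv)."""
    p0, p1, p2, p3 = class_probs
    if abs(p0 + p1 + p2 + p3 - 1.0) > 1e-9:
        raise ValueError("class probabilities must sum to 1")
    return {
        "i": p0,              # no flare
        "ii": p1 + p2 + p3,   # at least a class C flare
        "iii": p2 + p3,       # a class M or X flare
        "iv": p3,             # a class X flare
    }
```

Note that (i) and (ii) always sum to one, so only three of the four probabilities are free.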


Objective  prediction  of  probabilities  (i)-(iv),  or  variants  of  these,  has
been  accomplished  with  some  success  by  previous  investigators  employing  a 
variety  of  methods.  Applicable  papers  include  Hirman,  et  al.  (1980) 
— discriminant  analysis;  Vecchia,  et  al.  (1980) — discriminant  analysis  and 
logistic  regression;  and,  Jakimiec  and  Wasiucionek  (1980) — multiple  linear 
regression.  Successful  application  of  an  objective  technique  on-line  at  the 
SESC  will  enable  forecasters  to  attach  immediate  quantitative  significance  to 
the  many  interrelated  variables  comprising  the  inputs  to  any  forecast  method. 

In  section  3.1,  we  describe  the  general  problem  of  probability 
forecasting  and  review  two  methods  which  are  consistent  with  SESC  practice  and 
well-suited  for  on-line  solar  flare  forecasting  using  a  few  variables 
currently  monitored  at  the  SESC.  We  also  briefly  discuss  the 
inappropriateness  of  multiple  regression  analysis  for  solar  flare  forecasting 
within  the  present  framework.  Evaluation  measures  for  probability  forecasts 
are  reviewed  in  section  3.2,  and  we  conclude  by  noting  in  section  3.4.2  that 
regression  analysis  could  be  a  useful  approach  to  relate  flare  intensity  to 
active  region  variables  if  minor  revisions  in  data  collection  can  be  effected. 


3.1  Probability  Forecast  Methods 

Let Z denote a random variable and $\mathbf{x}$ a $p \times 1$ vector of
observable random variables used as explanatory or predictor variables for Z.
We are interested in the probability structure of Z. All information regarding
Z given an observed $\mathbf{x}$ is contained in the conditional probability
distribution, $F(z \mid \mathbf{x}) = \Pr[Z \le z \mid \mathbf{x}]$. We
therefore define a probability estimate to be an estimate
$\hat{F}(z \mid \mathbf{x})$ of the conditional probability distribution. To
describe the statistical methods useful for flare forecasting we consider only
estimation for a dichotomous random variable where Z = 1 (success) or 0
(failure). For example, Z = 1 might represent the occurrence of a C, M or X
class flare, and Z = 0 the non-occurrence of any class of flare. In subsection
3.1.3 we state models for dependent variables with more than two categories
and note references for this extension.



3.1.1  Discriminant  Analysis  (DA) 

The  general  problem  is  to  relate  a  categorical,  or  discrete,  dependent 
variable  to  one  or  more  predictor  variables,  which  may  or  may  not  be 
categorical.  That  is,  based  on  one  or  more  measurements  x  we  wish  to  classify 
an  observation  (element)  into  one  of  two  or  more  categories  (populations)  on 
the  basis  of  x.  It  can  be  assumed  that  each  category  (population)  is 
characterized  by  a  probability  distribution  of  x. 

Suppose that observations belong to two distinct populations $P_0$ and $P_1$,
characterized by joint probability density functions $f_0(\mathbf{x})$ and
$f_1(\mathbf{x})$, respectively. For example, $P_1$ might denote the set of
region-day occasions which produce at least a class C flare in the next 24
hours, and $P_0$ the set of cases producing no flare. Also, let the population
membership of a case j be given by the random variable $Z_j$, where $Z_j = 1$
if $\mathbf{x}_j$ has distribution $f_1$, and $Z_j = 0$ if $\mathbf{x}_j$ is
chosen from $f_0$. Let the prior probability that an observation belongs to
$P_1$ be $\Pr[Z_j = 1] = p_1$; hence, $\Pr[Z_j = 0] = p_0 = 1 - p_1$.

It is usually assumed that for $P_0$ and $P_1$, the random vector $\mathbf{x}$
has a multivariate normal distribution with different mean vectors
$\boldsymbol{\mu}_0$ and $\boldsymbol{\mu}_1$, but common covariance matrix
$\Sigma$. That is, for $\mathbf{x}$ a $p \times 1$ vector we have

$$f_i(\mathbf{x}) = \left[ (2\pi)^{p/2} |\Sigma|^{1/2} \right]^{-1} \exp\left[ -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) \right] , \quad i = 0, 1 .$$


Then  for  a  given  random  observation  and  associated  vector  x,  it  can  be  shown 
that 


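$$\Pr[Z = 1 \mid \mathbf{x}] = \left\{ 1 + (p_0/p_1) \exp\left( -\left[ \mathbf{x} - \tfrac{1}{2} (\boldsymbol{\mu}_1 + \boldsymbol{\mu}_0) \right]' \Sigma^{-1} \left[ \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0 \right] \right) \right\}^{-1} , \tag{3.1.1}$$

which is of the form

$$\Pr[Z = 1 \mid \mathbf{x}] = \left\{ 1 + \exp\left[ -(\alpha + \boldsymbol{\beta}'\mathbf{x}) \right] \right\}^{-1} , \tag{3.1.2}$$

where

$$\boldsymbol{\beta}'\mathbf{x} = \sum_{j=1}^{p} \beta_j x_j .$$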
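
As a short worked step, comparing the exponents of (3.1.1) and (3.1.2) gives the correspondence between the two parameterizations. Writing the two population mean vectors as mu_0 and mu_1 and the common covariance matrix as Sigma, and absorbing the prior odds p_0/p_1 into the exponent:

```latex
\begin{aligned}
\boldsymbol{\beta} &= \Sigma^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0), \\
\alpha &= -\ln(p_0 / p_1)
          - \tfrac{1}{2} (\boldsymbol{\mu}_1 + \boldsymbol{\mu}_0)' \Sigma^{-1}
            (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0).
\end{aligned}
```

Replacing the population quantities by their sample counterparts leads to the estimators (3.1.3).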


To construct a probability estimation technique to classify random
observations, we require estimates of $\alpha$ and $\boldsymbol{\beta}$. For
the formulation of the problem assuming normality of $\mathbf{x}$, which leads
to (3.1.1), it is sufficient to estimate $\boldsymbol{\mu}_0$,
$\boldsymbol{\mu}_1$, $\Sigma$, and, if necessary, $p_0$. Suppose
$\{\mathbf{x}_{0j},\ j = 1, \ldots, n_0\}$ and
$\{\mathbf{x}_{1j},\ j = 1, \ldots, n_1\}$ are random samples of observations
from $P_0$, $P_1$, respectively. Thus, we are given a set of cases with known
population memberships. Then, to estimate $\alpha$ and $\boldsymbol{\beta}$ we
use

$$\hat{\boldsymbol{\beta}} = S^{-1} (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_0) ,$$
$$\hat{\alpha} = -\ln(n_0/n_1) - 0.5\, (\bar{\mathbf{x}}_1 + \bar{\mathbf{x}}_0)' \hat{\boldsymbol{\beta}} , \tag{3.1.3}$$

where S is the pooled estimator of the common covariance matrix $\Sigma$, and
$\bar{\mathbf{x}}_0$ and $\bar{\mathbf{x}}_1$ are sample mean vectors. The
estimators (3.1.3) will be called discriminant function estimators (DFE's) of
$(\alpha, \boldsymbol{\beta})$, and by discriminant analysis (DA) we mean the
use of DFE's in (3.1.2) to obtain probability estimates for cases with unknown
population membership.
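
The estimators (3.1.3) are straightforward to compute. The following sketch uses NumPy; the function names, the array layout (one row per case), and the n0 + n1 - 2 divisor for the pooled covariance are assumptions made for illustration, since the report does not specify an implementation.

```python
import numpy as np


def fit_dfe(x0, x1):
    """Discriminant function estimators (alpha, beta) in the sense of (3.1.3).

    x0, x1: arrays of shape (n0, p) and (n1, p) holding the cases known to
    belong to populations P0 and P1.
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    n0, n1 = len(x0), len(x1)
    mean0, mean1 = x0.mean(axis=0), x1.mean(axis=0)
    # Pooled within-group covariance (assumed divisor: n0 + n1 - 2).
    d0, d1 = x0 - mean0, x1 - mean1
    pooled = (d0.T @ d0 + d1.T @ d1) / (n0 + n1 - 2)
    beta = np.linalg.solve(pooled, mean1 - mean0)
    alpha = -np.log(n0 / n1) - 0.5 * (mean1 + mean0) @ beta
    return alpha, beta


def predict_proba(alpha, beta, x):
    """Logistic form (3.1.2): estimated probability that Z = 1 given x."""
    return 1.0 / (1.0 + np.exp(-(alpha + np.asarray(x, dtype=float) @ beta)))
```

With equal sample sizes the prior term vanishes, and the estimated probability is 0.5 at the midpoint between the two sample means.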

3.1.2  Logistic  Regression  (LR)


The function (3.1.2) is the logistic response function, or logit, a
symmetric sigmoid curve. It appears to be a reasonable model for probability
forecasting because, as a monotone, smooth function of $\mathbf{x}$,
$\Pr[Z = 1 \mid \mathbf{x}]$ is bounded between 0 and 1 and approaches these
values as limits as $x_j \to \pm\infty$ for any j.

In  section  3.1.1,  the  assumption  that  x  is  normally  distributed  within 
each  population  resulted  in  probability  forecasts  of  the  logistic  form. 
However,  many  types  of  underlying  assumptions  about  x  lead  also  to  a 
prediction  equation  of  the  logistic  form.  For  example,  the  logistic  results 
if  some  predictor  variables  are  multivariate  normal  and  others  are 
dichotomous,  so  that  the  logistic  model  is  appropriate  for  more  general 
distributions  of  x  in  addition  to  multivariate  normal. 

We will mean by logistic regression (LR) the procedure by which
statistical maximum likelihood estimators (MLE's) of
$(\alpha, \boldsymbol{\beta})$ are obtained for the logistic regression model
(3.1.2). Thus, the LR formulation of the problem assumes that the probability
function should have the characteristics of the (nonlinear) logit function,
and then approximates the true curve by iteratively estimating
$(\alpha, \boldsymbol{\beta})$ directly. The interested reader is referred to
Goldstein and Dillon (1978) or Bishop, et al. (1975) for a further discussion
of these concepts.
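
One common way to obtain the MLE's iteratively is Newton-Raphson on the log-likelihood. The report does not state which algorithm was used, so the sketch below is only one possibility; the function name, the starting values of zero, and the stopping rule are assumptions.

```python
import numpy as np


def fit_logistic(x, z, n_iter=25, tol=1e-10):
    """Maximum likelihood estimates (alpha, beta) for the model (3.1.2).

    x: array of shape (n, p); z: array of n zeros and ones.
    Assumes the two groups overlap, so that a finite MLE exists.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones(len(x)), x])  # leading 1 carries alpha
    theta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-design @ theta))
        gradient = design.T @ (z - prob)
        # Information matrix: X' W X with W = diag(prob * (1 - prob)).
        information = (design * (prob * (1.0 - prob))[:, None]).T @ design
        step = np.linalg.solve(information, gradient)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta[0], theta[1:]
```

Unlike the DFE's, these estimates use no normality assumption on the predictors; they depend only on the logistic form of the response.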


3.1.3  Extension  to  Polychotomous  Case 

In  the  context  of  this  study,  Z  may  be  thought  of  as  the  polychotomous 
random  variable  representing  the  largest  flare  occurring  in  a  future  24-hour 
period.  The  dichotomous  logistic  model,  which  is  the  basis  for  LR  and  DA, 
generalizes  easily  for  estimating  the  probabilities  of  k  events  as  a  function 
of  one  or  more  explanatory  variables. 

Suppose that observations belong exclusively to one of k populations
$P_0, \ldots, P_{k-1}$, characterized by joint probability density functions
$f_0(\mathbf{x}), \ldots, f_{k-1}(\mathbf{x})$, where $\mathbf{x}$ is a
$p \times 1$ vector of explanatory variables. Also, let the population
membership of an observation be given by a random variable Z, where Z = i if
$\mathbf{x}$ is from $f_i(\mathbf{x})$. If the probability distributions are
multivariate normal with common covariance matrix, but different mean vectors,
then the posterior probability that an observation is drawn from $P_i$ takes
the form

$$\Pr[Z = i \mid \mathbf{x}] = \exp(\mathbf{x}^{*\prime} \boldsymbol{\beta}_i) \left[ \sum_{j=0}^{k-1} \exp(\mathbf{x}^{*\prime} \boldsymbol{\beta}_j) \right]^{-1} , \quad i = 0, \ldots, k-1 ;$$

where $\mathbf{x}^{*\prime} = [1, x_1, \ldots, x_p]$, and
$\boldsymbol{\beta}_j' = [\beta_{j0}, \beta_{j1}, \ldots, \beta_{jp}]$.
The one is annexed to the vector of explanatory variables to allow for a
constant in the model.

As in the dichotomous case, DFE's are obtained if multivariate normality
is assumed, and the estimators of the $\boldsymbol{\beta}_j$'s are functions
of the sample mean vectors, sample covariance matrix, and prior probabilities.
LR involves direct maximum likelihood estimation of the
$\boldsymbol{\beta}_j$'s. For a detailed discussion of LR in the polychotomous
case see Jones (1968) and Jones (1975).
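
The polychotomous form can be evaluated directly once coefficient vectors are available. In the sketch below the function name and argument layout are illustrative assumptions, and subtracting the largest score before exponentiating is a numerical safeguard added here; it does not change the resulting ratios.

```python
import math


def polychotomous_probs(x, betas):
    """Posterior probabilities Pr[Z = i | x] for i = 0, ..., k-1.

    x: list of p predictor values.
    betas: k coefficient lists, each of length p + 1 (constant term first).
    """
    x_star = [1.0] + list(x)  # annex the one to allow for a constant
    scores = [sum(b * v for b, v in zip(beta, x_star)) for beta in betas]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

Only differences between coefficient vectors affect the probabilities. With k = 2 and the first vector set to zero, the second probability reduces to the dichotomous logistic form (3.1.2).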


3.1.4  Multiple  Linear  Regression 

Although  methods  for  the  analysis  of  categorical  data  are  well  developed 
and  have  been  discussed  in  the  statistical  literature  for  many  years,  many 
analysts  of  categorical  data  continue  to  use  inappropriate  methods.  In
particular,  discrete  variables  are  often  treated  as  ordinary  variables  in 
regression  analysis  without  regard  to  assumptions  of  continuity. 

Suppose $(Z_j, \mathbf{x}_j)$, $j = 1, \ldots, n$, is a random sample of
observations from $(Z, \mathbf{x})$, where $\mathbf{x}$ is our usual vector of
explanatory variables and Z is 1 or 0 corresponding to the population from
which $\mathbf{x}$ is drawn. Thus, as before, we have a set of cases with known
population memberships. Some of the variables $\mathbf{x}$ may be categorical
and some may vary continuously. In such cases $\Pr[Z = 1 \mid \mathbf{x}]$
may be estimated by linear regression methods. The standard regression model
would be

$$Z_j = \mathbf{x}_j' \boldsymbol{\beta} + e_j , \quad j = 1, \ldots, n , \tag{3.1.4}$$

where $e_j$ denotes a random error term with $E(e_j) = 0$;
$\mathrm{Var}(e_j) = \sigma^2$; $E(e_j e_k) = 0$ if $j \ne k$. $E(\cdot)$
denotes expected value and $\mathrm{Var}(\cdot)$ denotes variance. It follows
that

$$E(Z \mid \mathbf{x}) = \Pr[Z = 1 \mid \mathbf{x}] = \mathbf{x}' \boldsymbol{\beta} , \tag{3.1.5}$$

where, for convenience, we have dropped the subscript. Then, given the usual
regression estimator $\hat{\boldsymbol{\beta}}$ from (3.1.4), we obtain a
probability estimate for cases with unknown population membership using

$$\hat{Z} = \widehat{\Pr}[Z = 1 \mid \mathbf{x}] = \mathbf{x}' \hat{\boldsymbol{\beta}} . \tag{3.1.6}$$


This  is  similar  to  the  Regression  Estimation  of  Event  Probabilities  (REEP) 
procedure  of  Miller  (1964). 

Use of (3.1.6) for probability forecasting presents some obvious
difficulties. Clearly the estimates are not constrained to be between 0 and
1. Further, if it is reasonable that as $\mathbf{x}'\boldsymbol{\beta}$
increases so also does the chance that Z will be one, then the true
probability forecast function should generally have the S shape (sigmoid)
since it must be nondecreasing and bounded between 0 and 1. To approximate
such a curve with a straight line may be reasonable over a portion of the
curve, but inadmissible for extreme values of $\mathbf{x}'\boldsymbol{\beta}$.

Finally we note also that, under our assumptions, for given $\mathbf{x}_j$,
$Z_j$ is a Bernoulli random variable (see Mood, et al. (1974)). It follows
that $E(Z_j \mid \mathbf{x}_j) = \mathbf{x}_j'\boldsymbol{\beta}$ and
$\mathrm{Var}(Z_j \mid \mathbf{x}_j) = \mathrm{Var}(e_j) = \mathbf{x}_j'\boldsymbol{\beta}\,(1 - \mathbf{x}_j'\boldsymbol{\beta})$.
This is a direct violation of the equal variance assumptions for (3.1.4),
since $\mathrm{Var}(e_j)$ depends on $\mathbf{x}_j$. Use of ordinary least
squares regression estimators under these conditions will yield imprecise
predictions.
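
Both difficulties can be seen in a small made-up example with a single predictor. The data and function names below are hypothetical and serve only to illustrate the argument; they are not taken from the SESC data base.

```python
def fit_line(x, z):
    """Ordinary least squares fit of z = a + b*x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x, slope


def bernoulli_variance(p):
    """Variance of a 0/1 random variable with success probability p."""
    return p * (1.0 - p)


# Made-up data: one predictor and a dichotomous outcome.
x = [0, 1, 2, 3, 4, 5, 6, 7]
z = [0, 0, 0, 0, 1, 1, 1, 1]
intercept, slope = fit_line(x, z)
fitted = [intercept + slope * xi for xi in x]
```

Here the fitted values at the two ends of the predictor range fall below 0 and above 1, so they cannot be read as probabilities. The Bernoulli variance p(1 - p) also changes with p (0.25 at p = 0.5, 0.09 at p = 0.9), so the error variance cannot be constant across cases.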

Possible  modifications  have  been  proposed  to  eliminate  technical 
difficulties  in  applying  linear  regression  to  categorical  data.  However, 
extensive  delineation  of  the  problems  or  solutions  for  the  linear  model 
approach  is  beyond  the  scope  of  this  report.  Theoretical  and  empirical 
details  regarding  inadmissibility  of  this  method  may  be  found  in  Nerlove  and 
Press  (1973)  and  Brelsford  and  Jones  (1967). 


3.2  Scoring  Probability  Forecasts 

For  this  report  the  major  intended  purpose  of  scoring  forecast  methods  is 
to  provide  a  measure  of  the  usefulness  of  proposed  statistical  models  relative 
to  the  subjective  forecasts  of  the  SESC.  It  should  be  emphasized  that  a 
complete  evaluation  of  a  proposed  technique  can  result  only  from  a  long  term 
comparison  to  the  procedure  of  the  SESC.  However,  as  a  preliminary  measure  of 
utility,  the  objective  flare  forecast  techniques  will  be  tested  by  comparing 
probability  estimates  from  fitted  models  to  the  baseline  subjective  forecasts. 
Two  scoring  procedures  discussed  by  Brelsford  and  Jones  (1967)  are  considered. 

Probability estimates or forecasts may be compared by a loss function,
$h(Z, \hat{Z})$. The Brier Score (Brier (1950)) used in meteorology is
essentially mean square error. For the general model, it assigns a loss of

$$h(Z_i, \hat{Z}_i) = \sum_{j=1}^{k} (\hat{Z}_{ij} - Z_{ij})^2$$

to the i-th trial (the subscript j denotes the event), where we use
$\hat{Z}_{ij}$ to denote an estimate of $\Pr[Z_i = j \mid \mathbf{x}_i]$, and
where $Z_{ij} = 1$ if the i-th trial produced event j and 0 otherwise. Given a
set of n trials, the Brier Score for a forecast method M is:

$$\mathrm{Brier}(M) = (1/n) \sum_{i=1}^{n} \sum_{j=1}^{k} \left( \hat{Z}_{ij}^{(M)} - Z_{ij} \right)^2 . \tag{3.2.1}$$
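
The Brier Score (3.2.1) can be computed directly from a table of forecast probabilities and the observed events. In this sketch the function name is an assumption, and events are indexed from 0 rather than from 1 as in the text:

```python
def brier_score(forecasts, outcomes):
    """Mean over trials of the summed squared forecast errors.

    forecasts: n rows, each a list of k event probabilities.
    outcomes: n indices (0-based) of the event that occurred on each trial.
    """
    total = 0.0
    for probs, event in zip(forecasts, outcomes):
        total += sum(
            (p - (1.0 if j == event else 0.0)) ** 2 for j, p in enumerate(probs)
        )
    return total / len(forecasts)
```

A perfect categorical forecast scores 0, and larger values indicate poorer forecasts.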

Another natural loss function derived from information theory assigns a
loss of $-\log \hat{Z}_{i(m)}$ to the i-th trial, where m is the event which
occurred. For forecast method M, the Information Loss Score is given by:

$$\mathrm{Info}(M) = -(1/n) \sum_{i=1}^{n} \log \hat{Z}_{i(m)}^{(M)} . \tag{3.2.2}$$

Given a finite sample of data, this score is minimized by maximum likelihood
estimation of the parameters in the logistic function, i.e., by LR estimators
of $(\alpha, \boldsymbol{\beta})$ in (3.1.2).

For both scoring measures it is desirable to achieve a minimum score.
The Information Loss Score may be preferred, however, because probability
estimates are constrained to the range $0 < \hat{Z} < 1$. A probability
prediction of zero is unacceptable if the event occurs, since the loss would
be $+\infty$.
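
The Information Loss Score (3.2.2) admits an equally short sketch. The function name is an assumption, events are indexed from 0, and natural logarithms are used; the report does not state the base of the logarithm.

```python
import math


def information_loss(forecasts, outcomes):
    """Mean of -log(probability assigned to the event that occurred).

    forecasts: n rows, each a list of k event probabilities.
    outcomes: n indices (0-based) of the event that occurred on each trial.
    """
    total = 0.0
    for probs, event in zip(forecasts, outcomes):
        p = probs[event]
        if p <= 0.0:
            return math.inf  # a zero forecast for an event that occurred
        total -= math.log(p)
    return total / len(forecasts)
```

A single zero-probability forecast for an event that occurs makes the whole score infinite, which is the penalty described above.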

Evaluation  and  comparison  of  probability  estimates  from  many  sources  is  a 
lengthy  subject.  General  mathematical  definitions  of  scoring  methods  with 
desirable  practical  properties  have  been  determined  (see  for  example,  Murphy 
and  Epstein  (1967)).  A  modification  by  Sanders  (1963)  partitions  the  Brier 
Score  into  components  to  determine  finer  aspects  of  skill  for  a  given  forecast 
method.  The  interested  reader  is  referred  to  Heckman  (1979a)  for  a  discussion 
of  these  concepts. 


3.3  Discussion 

Discriminant  analysis,  logistic  regression,  and  multiple  linear 
regression  are  only  representative  of  a  larger  number  of  possible  objective 
approaches  to  the  solar  flare  forecasting  problem.  Alternative  techniques  are 
not  discussed  in  this  report  because,  given  the  current  data  collection  scheme 
and  desired  SESC  product,  we  believe  that  these  three  statistical  methods  are 
the  only  easily  adaptable  solutions  to  the  solar  flare  forecasting  problem  as 
presently  formulated.  We  qualify  this  notion  by  commenting  that  the  proper 
application  of  any  statistical  method  demands  approximate  consistency  of  the 
sample  data  with  the  theoretical  foundations  (i.e.,  assumptions)  of  the 
technique.  Our  preliminary  analyses  suggest,  in  fact,  that  to  accommodate 
basic  assumptions  will  require  clever  data  manipulation  and/or  modification  of 
an  objective  approach. 

Determining reasonable compliance with basic assumptions will require
attention to the following areas of concern, which are relevant to one or more
of the three methods discussed above:


1.  Independence  of  observations.  Detailed  examination  of  spatial 
and/or  time  correlation  properties  of  solar  flare  data  is 
essential,  but  remains  to  be  done.  Possible  problems  have  been 
largely ignored for the current study.


2. Continuity and/or normality of variables. For example, DFE's
are derived assuming normality of the explanatory variables.
Practically, we require "approximate" conformance to this
assumption so that it is sometimes within reason to model
discrete scaled data using "normal theory."

3. Invariance of model parameters in time and space. To follow
secular variation in the solar cycle may require adaptive
models. This may involve a simple periodic recomputation of
model estimates but could require a theoretical modification of
a particular statistical method. Also, we have not accounted
for possible location differences.

4. Equality of variances or covariances (for explanatory
variables) among flare class groups. For example, to account
for unequal variances could require the inclusion of quadratic
terms in a logistic model. We have chosen, instead, to keep the
number of variables to a minimum by using variance stabilizing
transformations on some variables.


3.4  Recommendations 
3.4.1  Probability  Forecasting 

Press  and  Wilson  (1978)  note  that: 

"Discriminant  function  estimators  have  often  been  used  in  logistic 
regression,  in  both  theory  and  applications.  When  such  estimators  were 
compared  empirically  with  maximum  likelihood  estimators  for  logistic 
regression  problems,  however,  they  were  found  to  be  generally  inferior, 
although  not  always  by  substantial  amounts... 

The  rationale  for  a  logistic  formulation  of  the  relationship  between 
qualitative  and  other  variables  . . .  has  been  discussed  extensively  in  the 
literature  ..." 


We  suggest  consideration  of  the  logistic  formulation  for  the  solar  flare 
probability  forecasting  problem  for  many  of  the  reasons  alluded  to  in  the 
above  statement.  For  the  normal  theory  case,  we  derived  a  logistic  response 
function  model  and  noted  that  DA  is  appropriate  to  obtain  a  fitted  equation 
for  probability  forecasting.  It  was  observed,  however,  that  many  types  of 
underlying  assumptions  about  explanatory  variables  lead  also  to  a  logistic 
form  of  the  prediction  equation.  For  example,  the  logistic  results  if  some 
predictor  variables  are  multivariate  normal  and  others  are  dichotomous,  so 
that  LR  is  appropriate  for  more  general  distributions  than  multivariate 
normal.  The  DA  approach  is  strictly  applicable  only  if  predictors  are 
normally  distributed,  with  complete  equality  of  underlying  covariance 

To summarize common objections to the general use of DFE's we list the
following  arguments  stated  by  Press  and  Wilson  (1978): 


1. If explanatory variables do not follow a multivariate normal
distribution with equal covariance matrices among groups, DFE's
of the slope parameters ($\beta_j$'s) in the logistic function are not
"consistent." In particular, this means that if the predictor
variables are dichotomous, we cannot expect to obtain accurate
forecasts with DFE's, even with an infinite amount of data.
This result is proven in Halperin, et al. (1967).

2.  When  the  normality  assumption  is  violated,  meaningless 
variables  will  tend  to  be  erroneously  included  in  the  logistic 
function  with  DFE's. 

3. Use of DFE's tends to mask troublesome situations. For
example,  parameter  estimates  may  (correctly)  fail  to  exist  with 
LR,  but  DFE's  may  be  erroneously  computed. 

4.  There  is  evidence  that  DFE's  may  tend  to  generate  bias  in  some 
applications. 


For  the  solar  flare  probability  forecasting  problem,  most  of  the  explanatory 
variables  are  categorical  and  some  are  dichotomous.  Our  present  judgment  is 
that  LR  is  a  more  defensible  approach  under  these  circumstances. 

We  do  not  provide  additional  support  for  the  LR  approach  here.  Important 
comparisons  may  be  found  in  Brelsford  and  Jones  (1967),  Halperin  et  al. 
(1967),  and  Press  and  Wilson  (1978).  We  remark  that  in  a  similar  application, 
LR  is  used  by  the  National  Weather  Service  to  forecast  conditional 
probabilities  of  frozen  precipitation  (Glahn,  et  al.  (1973)). 


3.4.2  Estimation  of  Flare  Intensity 

In  section  3.1.4,  we  reviewed  specific  objections  to  the  use  of  multiple 
linear  regression  to  obtain  probability  forecasts.  The  major  reason  for 
rejecting  linear  regression  in  this  study  is  the  inappropriateness  of  the 
method to "predict" a discrete dependent variable. Because the choice of
forecast methods is dictated, in part, by the chosen scale for recording data,
inadmissibility of linear regression is contingent on the present data
collection  scheme. 

Specifically,  the  coded  SESC  flare  classifications  (i.e.,  classes  C,  M, 
X)  are  actually  a  categorization  of  peak  X-ray  yield,  as  measured  by 
satellites.  Recoding  a  continuous  measurement  into  a  discrete  classification 
may  induce  a  substantial  loss  of  information  and  greatly  restricts  the  range 
of  valid  data  analysis  methods.  Information  lost  in  recoding  can  be  recovered 
only  at  great  expense,  yet,  if  peak  flux  were  recorded  on  the  original  scale, 
the SESC classes could be assigned with little effort. We suggest that any
continuous  measurement  be  recorded  in  its  original  scale. 
Availability of peak flux measurements could allow the SESC to formulate
a linear regression model to forecast actual X-ray yields for active regions.
If this were a desirable product, and assuming that the explanatory variables
are suitably correlated with flare intensity, linear regression has the
following advantages.


1. Linear regression is simple, both conceptually and
computationally.

2. Transformed or interaction variables do not increase the
complexity of linear regression. Thus, functions of
explanatory variables are easily included in the analysis.

3. Confidence intervals for estimated peak flux are readily
obtainable.

4. The extension to forecasts for any lead time is
straightforward. In particular, if variables were recorded on time
scales less than 24 hours, no significant complications
arise.


In  conclusion,  we  remark  that  if  explanatory  variables  are  rescaled 
(presumably  yielding  "more  continuous"  scales)  during  a  conversion  to  less 
subjective,  automatic  measurement  techniques,  desirability  of  linear  and 
nonlinear  regression  methods  for  modeling  the  solar  flare  process  is  likely  to 
increase.


4.  SELECTION  OF  PREDICTORS 


4.1  Transformation  of  Variables 

Explanatory  or  predictor  variables  may  actually  be  transformations  of 
more  basic  variables.  Because  a  thorough  examination  of  potential  explanatory 
information  may  enhance  understanding  of  the  solar  flare  process,  the 
following  types  of  computed  variables  are  identified: 


1.  Log  and  power  transformations  of  basic  variables.  Typical 
purposes  are  to  stabilize  variance  or  to  induce  normality. 

2. "Functions" of basic variables. Loosely speaking, these
represent  interactions  among  variables  and  should  be 
suggested  by  expert  scientists.  Rarely  should  functions  be 
selected  based  on  empirical  evidence  of  association  with 
flare  occurrence. 

3.  Lagged  or  rate  of  change  variables.  For  example,  use  of 
lagged  variables  can  account  for  time  correlation  in  data. 
4. Reordered or recategorized basic variables. Some variables
may  require  reordered  scales  and  some  may  provide  as  much 
information  with  fewer  categories. 


Careful  consideration  of  transformed  variables  represents  a  very  large 
effort  in  itself.  To  date,  time  and  resources  have  not  permitted  exploration 
of  most  of  these  topics,  except  to  consider  log  transformations  on  some 
variables for stabilizing variances among flare class groups. Additionally, to
account  for  suspected  interactions  involving  flare  persistence,  we  chose  in 
our  analysis  to  fit  distinct  models,  conditional  on  the  class  of  flare 
occurring  during  the  24  hour  period  before  the  forecast  is  made.  Partitioning 
of  the  data  into  separate  segments  for  analysis  avoids  introducing  covariance 
(interaction)  terms  when  a  variable  is  known  or  thought  to  affect  the  levels 
of  other  variables  (see  Bishop,  et  al.  (1975),  page  359). 

4.2  Selection  of  Basic  Variables 

Since  unavailability  of  data  must  be  considered  in  any  practical 
real-time  scheme,  potential  explanatory  variables  have  been  divided  into  three 
sets  based  on  the  observed  frequency  that  data  are  missing.  In  Table  3  we 
display  this  grouping  of  variables  and  the  number  of  cases  for  which  the 
values  are  available.  The  final  entry  in  each  column  is  the  total  number  of 
cases  in  each  of  the  four  partitions  of  the  data  base.  Because  we  have 
conditioned on the largest flare in the past 24 hours, cases with AGE = 1 have
been  discarded. 

A  largely  heuristic  approach  was  used  to  select  a  reduced  set  of 
explanatory  variables  from  white  light,  H-alpha,  and  historical  measurements. 
It  was  decided  to  rank  the  variables  according  to  some  measure  of  association 
with  variable  number  39  (largest  flare  in  the  next  24  hours).  Variable  46 
(largest  flare  today;  persistence)  was  used  to  partition  the  data  because  of 
the  argument  stated  in  subsection  4.1.  This  procedure  yields  four  separate 
rankings  of  the  variables.  On  this  basis  we  have  determined  nine  variables  to 
be  employed  as  primary  forecast  criteria  in  this  report.  Variables  not 
selected  by  this  ad  hoc  procedure  may  prove  useful  in  the  future  if,  for 
example, interaction variables are allowed. The measure of association used
to  select  variables  is  described  below. 

First,  for  each  of  the  four  data  segments  one-way  analysis  of  variance 
(AOV) F-ratios were computed for every explanatory variable with variable 39
as  the  grouping  variable.  Clearly,  if  "on  the  average"  the  level  of  a  given 
measurement  varies  with  the  event  to  occur  in  the  next  24  hours,  the  variable 
in  question  may  be  useful  to  predict  the  event. 

Because  many  of  the  explanatory  variables  will  not  satisfy  basic 
assumptions,  we  do  not  apply  the  usual  interpretation  of  the  F-ratio  or 
significance  levels.  In  this  case  we  use  the  statistic  represented  by  an 
F-ratio  to  determine  relative  degrees  of  utility  among  explanatory  variables, 
since  the  "F"  statistic  has  general  mathematical  properties  useful  to  study 
differences  among  sample  means. 
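
One way to compute a linear-trend F statistic of this kind is sketched below. The report does not spell out its exact computation, so the details here are assumptions: equally spaced scores 0, 1, 2, 3 for the ordered flare classes, a single-degree-of-freedom trend sum of squares that allows unequal group sizes, and the pooled within-group mean square as the error term. The function name and the toy data are hypothetical.

```python
def linear_trend_f(groups):
    """F statistic for linear trend across ordered groups in a one-way layout.

    groups: list of lists of observations, ordered by flare class (N, C, M, X).
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    means = [sum(g) / len(g) for g in groups]

    # Equally spaced scores, centered at their size-weighted mean.
    scores = list(range(k))
    s_bar = sum(n * s for n, s in zip(sizes, scores)) / n_total

    # Sum of squares for the linear component (1 degree of freedom).
    numerator = sum(n * (s - s_bar) * m for n, s, m in zip(sizes, scores, means)) ** 2
    denominator = sum(n * (s - s_bar) ** 2 for n, s in zip(sizes, scores))
    ss_trend = numerator / denominator

    # Pooled within-group mean square (n_total - k degrees of freedom).
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_within = ss_within / (n_total - k)

    return ss_trend / ms_within

# Toy data: group means rise steadily, so the trend component is large.
print(linear_trend_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

Used only for ranking, as here, the statistic needs no distributional interpretation; larger values simply indicate a steadier change in the group means.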


Table 3. Number of Non-Missing Observations

                 Largest Flare Past 24 Hours
Variable          N       C       M       X

MAGCLAS        2985     574     159      21
SSDYNAM        2968     570     158      21
SSINTER        2982     574     157      21
STGDEV         2963     573     157      21
SECTEOW        2992     574     161      21
NEUTLOR        2953     563     159      20
REVPOL         2992     574     162      21
BRTPTS         2970     571     158      21
PLAGFLUX       2970     571     158      21
ISOPOLE        2972     570     157      21
EFR            2972     570     157      21
AFS            2972     570     157      21
FIRSTAPP       2992     574     162      21
FLUX           2992     574     162      21
RECSPOT        2985     572     159      21
PLAGFIL        2698     537     141      20
NEUTLCOM       2612     524     136      18
NEUTLCHG       2492     498     128      18
ASSOCFIL       2492     502     124      18
RV             1385     312      84      16
MAGSTR         1379     307      84      16
MAGGRAD        1595     296      81      14

Total cases    2992     574     162      21

The F statistic chosen to select variables corresponds to the F-ratio
used to test for linear trend in the one-way AOV. Let $F_N$, $F_C$, $F_M$, and $F_X$
represent F statistics for a given variable. The subscript denotes the value
of the conditioning variable, viz, the event occurring in the past 24 hours.
Then the ad hoc measure of association for the variable is represented by the
weighted score

$$R = \frac{w_N F_N + w_C F_C + w_M F_M + w_X F_X}{w_N + w_C + w_M + w_X}$$

where $w_N = 1$; $w_C = 2$; $w_M = 3$; $w_X = 4$. The weights have been chosen
arbitrarily to give greater importance to variables if they are useful to
predict flare types following previous flares. This simply acknowledges the
fact that most major flares tend to follow other flares.
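
As a worked check of the weighted score, the sketch below applies the weights 1, 2, 3, 4 to the four F statistics of a variable and truncates the result to its integer part, as is done for the scores reported in Table 4. The function and variable names are illustrative; the F values are those listed for MAGCLAS and RECSPOT in Table 4.

```python
# Weights for the conditioning event (largest flare in the past 24 hours).
WEIGHTS = {"N": 1, "C": 2, "M": 3, "X": 4}

def weighted_score(f_stats, weights=WEIGHTS):
    """Weighted average of the four linear-trend F statistics, truncated to an integer."""
    numerator = sum(weights[k] * f_stats[k] for k in weights)
    denominator = sum(weights.values())
    return int(numerator / denominator)

magclas = {"N": 150.8, "C": 86.1, "M": 56.3, "X": 8.4}
recspot = {"N": 227.9, "C": 71.2, "M": 30.8, "X": 3.0}

print(weighted_score(magclas))  # (150.8 + 172.2 + 168.9 + 33.6) / 10 = 52.55, truncated
print(weighted_score(recspot))  # (227.9 + 142.4 + 92.4 + 12.0) / 10 = 47.47, truncated
```

Both results agree with the Score(R) entries for these variables in Table 4.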


It must be emphasized that our variable selection procedure is
essentially heuristic. With this caution in mind, scores (R) and individual F
statistics are presented in Table 4. To assure that F statistics can be
compared within groups of variables, cases have been omitted from the analysis
if the value of any variable within the particular set is missing. Thus
scores or F statistics can be compared within groups, because all F-ratios are
based on the same number of degrees of freedom. A relatively high F value is
indicative of a stronger degree of association with flare occurrence. Scores
have been truncated to the integer part of the computed value.

Based  on  variable  scores  from  Table  4,  we  have  selected  nine  variables  in 
three  groups  to  be  used  as  forecast  criteria  for  the  models  fitted  in  section 
5  of  this  report.  For  the  nine  explanatory  variables  sample  means  and 
standard  deviations  preceding  each  largest  flare  event  are  shown  in  Tables 
5-7.  The  variables  have  been  ranked  within  groups  according  to  their  weighted 
F  statistic  score.  Additionally,  statistics  have  been  computed  from  the  same 
cases  used  to  obtain  F  values  and  sample  sizes  for  each  group  of  variables  are 
given  at  the  end  of  the  respective  tables.  In  each  table  the  first  block  of 
entries  for  a  variable  are  sample  means  based  on  the  number  of  observations 
shown  at  the  bottom  of  the  table. 


Table 4. Linear Trend "F" Statistics

                            Largest Flare Past 24 Hours
Variable     Score(R)       N        C        M        X

MAGCLAS         52        150.8     86.1     56.3      8.4
SSDYNAM          5         21.8      4.8      2.6       .5
SSINTER          1           .3      2.5      1.1       .9
STGDEV           3         16.8      2.7       .6      1.9
SECTEOW          0          5.3       .1       .0       .0
NEUTLOR          3         22.7      7.8       .1       .0
REVPOL           1          7.3      1.0       .0       .8
BRTPTS          13         78.5     21.3      3.4       .0
PLAGFLUX        10         83.7       .4       .3      4.6
ISOPOLE          5         12.7      5.9      7.9      2.1
EFR              0           .8      1.7       .4       -
AFS              7         44.2      9.5      1.9       .9
FIRSTAPP         3          4.6      3.7      6.1       .8
FLUX             1          9.2      1.2       .9       .6
RECSPOT         47        227.9     71.2     30.8      3.0
PLAGFIL          8         40.8     11.2      6.3       .2
NEUTLCOM        22        111.0     31.4     15.7       .0
NEUTLCHG         4         27.4      5.2      2.1       .4
ASSOCFIL         1          8.0      1.6       .1       .3
RV               2           .8       .4       .8      6.1
MAGSTR          11         18.7     32.1      7.3      2.1
MAGGRAD         18         45.5     29.9     24.8       .2

Table 5. Means and Standard Deviations: Set 1

                Event Next     Largest Event Past 24 Hours
Variable        24 Hours        N        C        M        X


Table 6. Means and Standard Deviations: Set 2

                           Event Next     Largest Event Past 24 Hours
Variable     Statistic     24 Hours        N        C        M        X

NEUTLCOM     Mean          N              .84     1.40     1.58     4.00
                           C             1.38     1.83     2.02     2.40
                           M             1.52     1.97     2.61     2.50
                           X             2.50     2.37     2.50     3.40

             Std. dev.     N              .77      .90     1.11      0
                           C              .98      .95      .98     1.51
                           M             1.03      .97     1.17      .83
                           X              .70     1.30      .57      .54

PLAGFIL      Mean          N              .99     1.98     2.17     2.50
                           C             1.57     2.42     3.19     2.60
                           M             1.47     2.88     2.74     3.66
                           X             2.50     1.87     3.75     2.80

             Std. dev.     N             1.29     1.60     1.46      .70
                           C             1.49     1.52     1.24     1.81
                           M             1.37     1.46     1.31     1.50
                           X             3.53      .99      .95     1.09

Number of Cases            N             2108      250       41        2
                           C              228      177       41        5
                           M               23       42       31        6
                           X                2        8        4        5


Table 7. Means and Standard Deviations: Set 3

                           Event Next     Largest Event Past 24 Hours
Variable     Statistic     24 Hours        N        C        M        X

MAGGRAD      Mean          N              .04      .08      .06      .10
                           C              .08      .12      .17      .21
                           M              .06      .17      .16      .32
                           X              .15      .15      .27      .20

             Std. dev.     N              .05      .08      .06      0
                           C              .06      .08      .09      .13
                           M              .05      .09      .08      .17
                           X              .14      .10      .08      .06

MAGSTR       Mean          N            15.38    16.55    17.26    15.00
                           C            17.37    18.16    19.44    21.00
                           M            17.43    20.35    20.04    21.60
                           X            18.50    20.66    22.50    22.50

             Std. dev.     N             4.61     3.84     3.97      0
                           C             4.81     3.59     4.52     4.83
                           M             4.42     3.27     3.68     3.04
                           X             2.12     4.63     2.12     3.10

Number of Cases            N             1009      143       26        1
                           C              104       98       25        4
                           M               16       28       23        5
                           X                2        6        2        4

4.3  Variable  Rescaling 

From the information provided in Tables 5-7, it was decided to transform some
of the nine explanatory variables to a log scale. The decision to rescale any
given variable was based on an (intuitive) examination of sample standard
deviations within data segments. It is our purpose in this case to avoid the
introduction of quadratic terms into the analysis, since these can be shown to
be necessary in the logistic function if covariance matrices are not equal
among groups. Again our decisions are heuristic and should be reconsidered in
any continuation of this work.
The nine selected forecast criteria, in rescaled form, are listed below.
Log transformations are base 10.

$$
\begin{aligned}
x_1 &= \log(\mathrm{MAGCLAS} + .5) \\
x_2 &= \log(\mathrm{RECSPOT} + .5) \\
x_3 &= \log(\mathrm{BRTPTS} + 1.5) \\
x_4 &= \mathrm{PLAGFLUX} + 1.0 \\
x_5 &= \mathrm{AFS} + 1.0 \\
x_6 &= \log(\mathrm{NEUTLCOM} + 1.5) \\
x_7 &= \log(\mathrm{PLAGFIL} + 1.5) \\
x_8 &= \log(\mathrm{MAGGRAD} + 1.5) \\
x_9 &= \mathrm{MAGSTR} + 1.0
\end{aligned}
\tag{4.3.1}
$$

In section 5, probability forecasts are obtained for both LR and DA using
these nine explanatory variables.
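
The rescaling in (4.3.1) can be sketched in code as follows. This is only an illustration: the dictionary layout and the function name `rescale` are assumptions, and the raw values in the example are hypothetical rather than taken from the data base.

```python
import math

def rescale(case):
    """Return [x1, ..., x9] from raw region-analysis values, following (4.3.1).

    `case` is assumed to map variable names to non-missing numeric values.
    Logarithms are base 10.
    """
    log10 = math.log10
    return [
        log10(case["MAGCLAS"] + 0.5),   # x1
        log10(case["RECSPOT"] + 0.5),   # x2
        log10(case["BRTPTS"] + 1.5),    # x3
        case["PLAGFLUX"] + 1.0,         # x4
        case["AFS"] + 1.0,              # x5
        log10(case["NEUTLCOM"] + 1.5),  # x6
        log10(case["PLAGFIL"] + 1.5),   # x7
        log10(case["MAGGRAD"] + 1.5),   # x8
        case["MAGSTR"] + 1.0,           # x9
    ]

# Hypothetical raw values for one region-day.
example = {
    "MAGCLAS": 2, "RECSPOT": 3, "BRTPTS": 1, "PLAGFLUX": 0, "AFS": 1,
    "NEUTLCOM": 2, "PLAGFIL": 1, "MAGGRAD": 0.1, "MAGSTR": 17,
}
print(rescale(example))
```

The added constants keep the logarithm defined when a coded variable takes the value zero.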


5.  EXAMPLE 


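To illustrate the ideas of section 3, we consider estimation of the
following conditional probabilities:

$$
\begin{aligned}
&\text{(i)}   &&\Pr[Z = 0 \mid x] \\
&\text{(ii)}  &&\Pr[Z \ge 1 \mid x] \\
&\text{(iii)} &&\Pr[Z \ge 2 \mid x] \\
&\text{(iv)}  &&\Pr[Z = 3 \mid x]
\end{aligned}
\tag{5.0.1}
$$

where $Z = 0$, 1, 2, or 3 corresponding to the largest flare which occurs in the
next 24 hours in a selected region. Four separate models, identified by
subsets of the nine variables in (4.3.1), are considered:

$$
\begin{aligned}
\text{Model I:}   &\quad \{x_1, \ldots, x_5\} \\
\text{Model II:}  &\quad \{x_1, \ldots, x_7\} \\
\text{Model III:} &\quad \{x_1, \ldots, x_9\} \\
\text{Model IV:}  &\quad \{x_1, x_4\}
\end{aligned}
\tag{5.0.2}
$$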

Because we have decided to condition on flare persistence, fitting distinct
models following each flare class, Model IV is included to forecast cases
following a class X flare, since the low number of available cases dictates that
only a few parameters be estimated. Each of models I-III is fitted
conditional on No, C, or M class flare occurrence. The combination of
multiple models and conditioning on persistence gives rise to the models
indicated by an asterisk in Table 8. Both LR and DA lead to fitted models which
are well suited for on-line forecasting. For the examples, the methods are
compared to the baseline SESC forecast.


Table 8. Forecast Models

             Largest Event Past 24 Hours
Model        No       C        M        X

I            *        *        *
II           *        *        *
III          *        *        *
IV                                      *

5.1  Case  Selection 

The current analysis utilizes 6097 region-day records collected from
January  1,  1977  to  January  31,  1979.  These  records  were  first  reduced  to  4487 
records  by  the  elimination  of  records  indicating  the  absence  of  sunspots, 
since  such  regions  rarely  produce  flares.  For  each  of  the  conditional  models 
in  Table  8,  parameters  are  estimated  using  all  cases  with  non-missing  data  on 
variables  associated  with  the  particular  model  number,  according  to  the  lists 
in (5.0.2). That is, fitted models are not based on a common set of cases.
Additionally,  because  the  number  of  cases  following  the  no  flare  event 
exceeded  computer  program  limitations,  models  for  column  one  in  Table  8  are 
estimated  using  cases  after  December  1977. 
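
The case selection just described amounts to a complete-case filter applied separately for each model and persistence class. A minimal sketch follows; the record layout, the key name `past`, and the function name are hypothetical, and the variable list for Model IV (x1 and x4) is an assumption based on the variables appearing in Table 14.

```python
# Variables required by each model, using the x1..x9 labels of (4.3.1).
MODEL_VARIABLES = {
    "I":   ["x1", "x2", "x3", "x4", "x5"],
    "II":  ["x1", "x2", "x3", "x4", "x5", "x6", "x7"],
    "III": ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9"],
    "IV":  ["x1", "x4"],  # assumed from Table 14 (MAGCLAS, PLAGFLUX)
}

def select_cases(cases, model, persistence):
    """Keep cases in the given persistence class with no missing model variables.

    Each case is a dict; a missing value is represented by None or an absent key.
    """
    needed = MODEL_VARIABLES[model]
    return [
        c for c in cases
        if c["past"] == persistence
        and all(c.get(v) is not None for v in needed)
    ]

# Hypothetical records: the second is missing x5, so only Model IV could use it.
cases = [
    {"past": "C", "x1": 0.4, "x2": 0.5, "x3": 0.2, "x4": 1.0, "x5": 2.0},
    {"past": "C", "x1": 0.7, "x2": 0.1, "x3": 0.3, "x4": 2.0, "x5": None},
    {"past": "N", "x1": 0.1, "x2": 0.1, "x3": 0.2, "x4": 1.0, "x5": 1.0},
]
print(len(select_cases(cases, "I", "C")))   # cases usable for Model I after a C flare
print(len(select_cases(cases, "IV", "C")))  # cases usable for Model IV after a C flare
```

Because the filter depends on the model, the fitted models rest on different sets of cases, as noted above.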


5.2  Grouping  of  Cases 

As  k  increases,  the  polychotomous  logistic  model  requires  increasing 
numbers  of  parameters  to  be  estimated.  With  the  present  data  base  and 
approach,  some  of  the  models  must  be  estimated  based  on  a  relatively  small 
number  of  available  cases.  Because  the  estimation  of  a  large  number  of 
parameters  is  subject  to  criticism  under  these  circumstances,  it  was  decided 
to use a dichotomous ($k = 2$) model at each stage of parameter estimation. This
is  accomplished  by  regrouping  and  recoding  cases  depending  on  which 
probability  in  (5.0.1)  is  being  considered.  Each  combination  of 
model-persistence-event  leads  to  one  set  of  estimated  parameters,  avoiding 
estimation  of  a  4-category  model.  According  to  this  scheme,  Table  9  displays 
the  manner  in  which  cases  are  combined  for  each  model-persistence  combination. 
Note that $\Pr[Z = 0 \mid x]$ is not listed but can be obtained by subtraction.


Table 9. Case Grouping

Probability                  Case Groups

$\Pr[Z \ge 1 \mid x]$        N vs. C, M, X
$\Pr[Z \ge 2 \mid x]$        N, C vs. M, X
$\Pr[Z = 3 \mid x]$          N, C, M vs. X
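
The regrouping in Table 9 can be sketched as a recoding of the 4-category outcome into a 0/1 response for each event. The numeric coding of flare classes (N = 0, C = 1, M = 2, X = 3) is assumed from the case groups in Table 9; the function name is hypothetical.

```python
# Assumed coding of the largest flare in the next 24 hours.
FLARE_CODE = {"N": 0, "C": 1, "M": 2, "X": 3}

def dichotomize(flare_classes, threshold):
    """Recode flare classes to 1 if Z >= threshold, else 0.

    threshold = 1 gives N vs. C, M, X;
    threshold = 2 gives N, C vs. M, X;
    threshold = 3 gives N, C, M vs. X (Z >= 3 is the same event as Z = 3).
    """
    return [1 if FLARE_CODE[f] >= threshold else 0 for f in flare_classes]

observed = ["N", "C", "M", "X", "N"]
for threshold in (1, 2, 3):
    print(threshold, dichotomize(observed, threshold))
```

Each recoded response is then used to fit one dichotomous model per model-persistence combination.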


We conclude this section by remarking that the (unorthodox) scheme of
recoding dependent variable cases into a dichotomy at each stage but
reconstructing a 4-category type analysis will generally have disturbing
theoretical implications. Two such problems are: (1) probabilities may no
longer be additive when obtained by subtraction, that is, $\sum_j \Pr[Z = j \mid x]$ may not
be 1; and (2) if the equal covariance matrix assumption is satisfied in the
4-category case, it may necessarily be violated by constructing a dichotomy.
For  these  reasons,  the  present  approach  should  be  viewed  as  ad  hoc,  and  should 
be  thoroughly  evaluated  if  continued.  It  is  likely,  however,  that  if  data 
management  problems  are  corrected,  the  resulting  increase  in  reliable  data 
will  allow  use  of  the  4-category  model,  thus  eliminating  the  need  for 
regrouping  of  cases. 


5.3  Validation 

In  most  analyses  of  the  logistic  type,  it  is  useful  to  divide  the  data 
cases  into  two  groups  — a  training  set  and  a  validation  set.  The  validation 
set  is  held  out  of  the  parameter  estimation  phase  of  the  analysis,  but  used 
later  to  cross-validate  the  probability  forecast  function  estimated  from  the 
training  set.  Normally,  if  the  data  are  unordered,  the  two  samples  are 
obtained  by  completely  random  subdivision  of  the  cases.  Considering  the 
sequential  nature  of  the  data  in  the  current  application,  a  more  logical 
approach  is  to  hold  out  data  after  a  selected  date  to  be  used  for  validation 
"into  the  future."  A  variation  of  this  idea  was  used  by  Hirman,  et  al. 
(1980)  using  a  sliding  data  window. 
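
A time-ordered hold-out of this kind can be sketched as below. The record layout and the function name are hypothetical; the cutoff date is chosen only for illustration.

```python
from datetime import date

def time_ordered_split(records, cutoff):
    """Split records into a training set (before cutoff) and a validation set (on or after)."""
    training = [r for r in records if r["date"] < cutoff]
    validation = [r for r in records if r["date"] >= cutoff]
    return training, validation

# Hypothetical region-day records.
records = [
    {"date": date(1978, 6, 15), "flare": "N"},
    {"date": date(1979, 1, 31), "flare": "C"},
    {"date": date(1979, 2, 1), "flare": "M"},
    {"date": date(1979, 5, 20), "flare": "N"},
]
training, validation = time_ordered_split(records, date(1979, 2, 1))
print(len(training), len(validation))
```

Unlike a random subdivision, no validation case precedes any training case, so the check mimics forecasting "into the future."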

We  agree  that  time  ordered  validation  is  the  sensible  approach  and  is 
naturally  consistent  with  real-time  evaluation  of  forecast  methods.  However, 
small  samples  generated  by  the  conditional  model  approach  have  precluded 
validation  for  the  examples  in  this  report.  We  have  chosen  to  use  all  data  to 
obtain  parameter  estimates  which  are,  hopefully,  more  accurate  and  precise. 
Model validation (or invalidation) can be easily accomplished using the
"cleaned"  data  set  for  February  1,  1979  to  June  10,  1979  when  it  becomes 
available. 


5.4  Parameter  Estimation 

In this section we obtain LR estimators of $(\alpha, \beta)$ for each
model-persistence-event combination. Corresponding DA estimators are not listed,
but have been used for comparison in later sections.

Let the flare events to be forecast be given by

$$E_1 = \{Z \ge 1\}, \qquad E_2 = \{Z \ge 2\}, \qquad E_3 = \{Z = 3\}$$

and let $(\hat{\alpha}, \hat{\beta})_{mpi}$ represent estimators of $(\alpha, \beta)$ for model $m$ (i.e., I, II, III,
or IV), persistence value $p$ (i.e., N, C, M, or X), and event $E_i$. Then the
probability forecast on case $j$ for event $E_i$ is

$$\widehat{\Pr}[E_i \mid x_j] = \left\{1 + \exp\left[-(\hat{\alpha} + \hat{\beta}' x_j)\right]\right\}^{-1} \tag{5.4.1}$$

where, for convenience, we have dropped subscripts on the estimators $(\hat{\alpha}, \hat{\beta})$.
Here $x_j$ denotes the set of observed variables on case $j$ corresponding to the
appropriate model. For example, with model I, $\hat{\beta}' x_j = \hat{\beta}_1 x_{1j} + \cdots + \hat{\beta}_5 x_{5j}$. Recall
that DA or LR forecasts derive from (5.4.1) depending on the estimation
method used to obtain $(\hat{\alpha}, \hat{\beta})$.
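
Evaluating (5.4.1) for a single case is straightforward, as the sketch below suggests. The function name is hypothetical. The coefficients are those read from the "M or X" column of Table 14 (Model IV, following an X flare), and the raw MAGCLAS and PLAGFLUX values are invented for illustration; the rescaling follows (4.3.1).

```python
import math

def logistic_forecast(alpha, beta, x):
    """Probability forecast {1 + exp[-(alpha + beta'x)]}^(-1) for one case."""
    eta = alpha + sum(b * xi for b, xi in zip(beta, x))
    # Use the algebraically equivalent form for negative eta to avoid overflow.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    return math.exp(eta) / (1.0 + math.exp(eta))

# Model IV, persistence X, event "M or X" (values as read from Table 14).
alpha = -3.560
beta = [8.554, -1.416]          # coefficients of x1 (MAGCLAS) and x4 (PLAGFLUX)

# Hypothetical raw observations, rescaled as in (4.3.1).
magclas, plagflux = 2.0, 0.0
x = [math.log10(magclas + 0.5), plagflux + 1.0]

print(round(logistic_forecast(alpha, beta, x), 3))
```

The same function serves for DA or LR forecasts; only the source of the coefficient estimates differs.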

To  determine  the  usefulness  of  explanatory  variables  to  predict  flares,  a 
chi-square  test  of  significance  was  computed  for  each  model-persistence-event 
triplet. Entries in Table 10 are significance levels which have been used to
identify circumstances where explanatory variables (apparently) do not provide
information associated with flare occurrence.


Table 10. Overall Chi-Square Significance Levels*

          Largest Event       Flare Event to be Forecast
Model     Past 24 Hours       C, M or X     M or X      X

I         No Flare            .0000         .0000       --
          C Flare             .0000         .0000       .3735
          M Flare             .0000         .0000       .0387

II        No Flare            .0000         .0000       --
          C Flare             .0000         .0000       .2772
          M Flare             .0000         .0002       .1109

III       No Flare            .0000         .0075       --
          C Flare             .0000         .0000       .2144
          M Flare             .0000         .0004       --

IV        X Flare             .0012         .0115       .0339

*Values are not reported if the data are insufficient for stable estimation of
parameters.
Tables 11-14 display estimated parameters for models with significance levels
less than .20. Individual underlined coefficients are those significantly
non-zero (at the 5% significance level), but conditional on all other
variables already in the logistic function. Estimated coefficients correspond
to transformed variables though, for brevity, we have identified parameters
with the original variable names. That is, $\beta_j$ corresponds to $x_j$ where
$(x_1, \ldots, x_9)$ are defined in (4.3.1).


Table 11. Estimated Parameters: Model I

Largest Event                 Parameter      Flare Event to be Forecast
Past 24 Hours*   Variable     Estimated      C, M or X      M or X         X

No Flare (sample sizes: 2089 N, 226 C, 32 M, 1 X)
                 Constant     $\alpha$         -5.228        -7.487
                 MAGCLAS      $\beta_1$         2.852         2.162
                 RECSPOT      $\beta_2$      [illegible]      1.586
                 BRTPTS       $\beta_3$      [illegible]   [illegible]
                 PLAGFLUX     $\beta_4$          .4           1.249
                 AFS          $\beta_5$      [illegible]  -[illegible]

C Flare (sample sizes: 309 N, 199 C, 54 M, 8 X)
                 Constant     $\alpha$         -3.600        -4.656
                 MAGCLAS      $\beta_1$         4.092         3.898
                 RECSPOT      $\beta_2$      [illegible]   [illegible]
                 BRTPTS       $\beta_3$      [illegible]      -.034
                 PLAGFLUX     $\beta_4$     -[illegible]      -.049
                 AFS          $\beta_5$          .428          .037

M Flare (sample sizes: 56 N, 51 C, 45 M, 4 X)
                 Constant     $\alpha$         -4.463        -3.050        -5.284
                 MAGCLAS      $\beta_1$         8.368         4.024        10.357
                 RECSPOT      $\beta_2$          .407         2.158          .209
                 BRTPTS       $\beta_3$         2.704     -[illegible]      5.743
                 PLAGFLUX     $\beta_4$         -.330          .098         1.247
                 AFS          $\beta_5$         -.086         -.989        -9.226

*Also indicated are sample sizes for flare types based on Variable 39.


Table 12. Estimated Parameters: Model II

Largest Event                 Parameter      Flare Event to be Forecast
Past 24 Hours    Variable     Estimated      C, M or X      M or X         X

No Flare (sample sizes: 1761 N, 206 C, 26 M, 1 X)
                 Constant     $\alpha$         -5.437         7.955
                 MAGCLAS      $\beta_1$         2.724         2.134
                 RECSPOT      $\beta_2$         1.412         1.510
                 BRTPTS       $\beta_3$         1.047         1.619
                 PLAGFLUX     $\beta_4$          .270         1.213
                 AFS          $\beta_5$          .359        -.3[?]4
                 NEUTLCOM     $\beta_6$         1.98[?]       1.610
                 PLAGFIL      $\beta_7$      [illegible]       .000

C Flare (sample sizes: 270 N, 186 C, 47 M, 8 X)
                 Constant     $\alpha$         -3.860        -5.140
                 MAGCLAS      $\beta_1$         3.862         3.638
                 RECSPOT      $\beta_2$          .928         1.053
                 BRTPTS       $\beta_3$         1.425          .326
                 PLAGFLUX     $\beta_4$         -.15[?]       -.037
                 AFS          $\beta_5$          .500         -.009
                 NEUTLCOM     $\beta_6$      [illegible]       .315
                 PLAGFIL      $\beta_7$          .279          .638

M Flare (sample sizes: 47 N, 44 C, 38 M, 4 X)
                 Constant     $\alpha$         -6.056        -3.635
                 MAGCLAS      $\beta_1$         7.876         3.508
                 RECSPOT      $\beta_2$      [illegible]      2.232
                 BRTPTS       $\beta_3$         2.853         -.209
                 PLAGFLUX     $\beta_4$         -.019          .286
                 AFS          $\beta_5$         -.295         -.904
                 NEUTLCOM     $\beta_6$          .310         1.375
                 PLAGFIL      $\beta_7$         2.442        -1.161
Table 13. Estimated Parameters: Model III

Largest Event                 Parameter      Flare Event to be Forecast
Past 24 Hours    Variable     Estimated      C, M or X      M or X         X

No Flare (sample sizes: 557 N, 71 C, 12 M, 1 X)
                 Constant     $\alpha$         -8.791        -6.678
                 MAGCLAS      $\beta_1$          .523          .574
                 RECSPOT      $\beta_2$          .779         1.759
                 BRTPTS       $\beta_3$         1.171         2.337
                 PLAGFLUX     $\beta_4$          .758         1.439
                 AFS          $\beta_5$          .518         -.530
                 NEUTLCOM     $\beta_6$         2.782         3.340
                 PLAGFIL      $\beta_7$      [illegible]       .185
                 MAGGRAD      $\beta_8$         9.749       -16.455
                 MAGSTR       $\beta_9$          .065          .085

C Flare (sample sizes: 132 N, 89 C, 24 M, 6 X)
                 Constant     $\alpha$         -5.574        -8.465
                 MAGCLAS      $\beta_1$         4.132         3.060
                 RECSPOT      $\beta_2$      [illegible]      1.988
                 BRTPTS       $\beta_3$          .934          .350
                 PLAGFLUX     $\beta_4$          .110          .184
                 AFS          $\beta_5$          .392         -.379
                 NEUTLCOM     $\beta_6$          .856         -.055
                 PLAGFIL      $\beta_7$          .125          .120
                 MAGGRAD      $\beta_8$         -.756          .427
                 MAGSTR       $\beta_9$          .100         1.815

M Flare (sample sizes: 25 N, 22 C, 22 M, 2 X)
                 Constant     $\alpha$        -11.307        -5.371
                 MAGCLAS      $\beta_1$        10.884         2.750
                 RECSPOT      $\beta_2$     -[illegible]      3.666
                 BRTPTS       $\beta_3$         1.365        -1.663
                 PLAGFLUX     $\beta_4$         -.932         -.812
                 AFS          $\beta_5$         -.592         -.898
                 NEUTLCOM     $\beta_6$         2.266         4.583
                 PLAGFIL      $\beta_7$          .474        -4.674
                 MAGGRAD      $\beta_8$        37.698        16.784
                 MAGSTR       $\beta_9$          .057         -.010

Table 14. Estimated Parameters: Model IV

Largest Event                 Parameter      Flare Event to be Forecast
Past 24 Hours    Variable     Estimated      C, M or X      M or X         X

X Flare (sample sizes: 3 N, 5 C, 7 M, 6 X)
                 Constant     $\alpha$        -43.571        -3.560         4.709
                 MAGCLAS      $\beta_1$       105.350         8.554         5.350
                 PLAGFLUX     $\beta_4$         6.311        -1.416        -9.131


Associations of the explanatory variables with flare incidence may be
(roughly) studied by inspection of the estimated prediction equations.
Estimates of the βj's which are positive are indicative of positive
association of the corresponding Xj with flare occurrence. Note, for example,
that (very roughly speaking) plage fluctuations seem to be either positively
associated or unassociated with the occurrence of flares except following the
occurrence of X flares, when the absence of plage fluctuations may be
associated with persistence of large flares. We emphasize that caution must
be exercised in making statements like the one above, which was presented to
illustrate the interpretation of model coefficients. Clearly, it is possible
that apparent associations are purely spurious, so we should take great care
in interpreting results.
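
The sign interpretation above can be illustrated with a minimal sketch. It
assumes the usual logistic form, Pr[event | x] = 1 / (1 + exp(-(α + Σ βj xj))),
and uses hypothetical coefficients chosen only for illustration; they are not
taken from Tables 11-14.

```python
import math


def logistic_probability(alpha, betas, x):
    """Return 1 / (1 + exp(-(alpha + sum_j beta_j * x_j)))."""
    eta = alpha + sum(b * xj for b, xj in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (assumption, for illustration only).
alpha = -2.0
betas = [0.8, -0.5]  # beta_1 > 0, beta_2 < 0

baseline = logistic_probability(alpha, betas, [0, 0])
x1_present = logistic_probability(alpha, betas, [1, 0])  # raises the probability
x2_present = logistic_probability(alpha, betas, [0, 1])  # lowers the probability

print(x2_present < baseline < x1_present)
```

A positive βj moves the estimated probability above the baseline and a
negative βj moves it below, which is all that the sign alone can tell us.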

Because  explanatory  variables  have  different  scales  of  measurement,  it  is 
not  possible  to  interpret  directly  the  magnitude  of  estimated  parameters  in 
Tables  11-14.  A  procedure  to  facilitate  interpretation  of  parameters  by 
computing  standardized  coefficients  is  available  and  should  be  considered  in 
any  continuation  of  this  work.  The  reader  is  referred  to  Bishop,  et  al. 
(1975)  for  a  discussion  of  this  technique. 
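
As a rough sketch of what standardization does, one common convention
multiplies each coefficient by the sample standard deviation of its variable.
This convention is an assumption made here for illustration and is not
necessarily the procedure described by Bishop, et al. (1975); the data and
coefficients below are hypothetical.

```python
import statistics


def standardized_coefficients(betas, columns):
    """Scale each coefficient by the sample standard deviation of its variable."""
    return [b * statistics.stdev(col) for b, col in zip(betas, columns)]


# Hypothetical data: the second variable is the first measured in half units.
columns = [
    [0, 1, 2, 3, 4],
    [0.0, 0.5, 1.0, 1.5, 2.0],
]
betas = [0.5, 1.0]  # raw coefficients differ only because the scales differ

standardized = standardized_coefficients(betas, columns)
print(standardized)
```

After scaling, the two coefficients agree, showing that the difference in the
raw values reflected the units of measurement rather than the strength of
association.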


5.5  Model  Evaluation 

We are particularly interested in a comparison of actual predictions by
the objective and subjective procedures. In this section Brier scores for
SESC, DA, and LR forecasts are presented. Recall that, at this time, we do
not have reliable data for cross-validation of results. We remark here that
it is reasonable to suspect that some cases for which large discrepancies
between SESC and objective methods exist could be the result of data
tabulation or keypunch error. This would not, in principle, be a problem if
suggestions for data collection and management could be successfully
implemented (see section 2). Existence of outliers will disrupt the
estimation of parameters and evaluation of models.


Probability estimates were obtained for all observations using estimated
logit functions for both DA and LR. Detailed classification tables for the
cases are given in Appendix B. Because Brier (or Information Loss) functions
indicate the "average" discrepancy between the probability estimates for
events and a posteriori probabilities (viz, 1 for the event which occurred, 0
for other events), scores are more informative measures of forecast method
performance than classification tables. Brier scores are given in Tables
15-17.
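
A minimal sketch of such a score follows. It assumes the two-category form
(event and its complement), which is consistent with the square-root rule
given at the end of this section and with table entries ranging from 0 to 2;
the forecasts and outcomes are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Mean over cases of the squared deviations for the event and its complement.

    forecasts: probabilities assigned to the event.
    outcomes:  1 if the event occurred, 0 otherwise.
    """
    total = 0.0
    for p, o in zip(forecasts, outcomes):
        total += (p - o) ** 2 + ((1 - p) - (1 - o)) ** 2
    return total / len(forecasts)


# Hypothetical forecasts (assumption, for illustration only).
forecasts = [0.9, 0.2, 0.6]
outcomes = [1, 0, 0]

score = brier_score(forecasts, outcomes)
perfect = brier_score([1.0, 0.0], [1, 0])  # lowest possible score
worst = brier_score([0.0, 1.0], [1, 0])    # highest possible score
print(score, perfect, worst)
```

Under this form a perfect forecast scores 0 and a forecast that is certain
and wrong scores 2, so lower scores indicate better forecasts.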

Tables 15-17 are intended, primarily, to illustrate comparison of
probability forecast methods. Emphasizing that LR and DA scores are based
on the same data used to estimate parameters, we note (with caution) that,
where applicable, LR generally has lower total scores than SESC or DA, but
differences with DA are nearly negligible. The latter result was expected
since the log transformation tends to induce normality. Detailed analysis of
scores will not be presented in the absence of a validation data set.
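
The "All" rows of Tables 15-17 appear to be case-weighted means of the rows
above them. This is our reading of the tables rather than a statement made in
the text; the sketch below checks it against the SESC column of Table 15 for
Model I, no flare in the past 24 hours.

```python
def pooled_score(counts, scores):
    """Case-weighted mean of per-category Brier scores."""
    total_cases = sum(counts)
    return sum(n * b for n, b in zip(counts, scores)) / total_cases


# Table 15, Model I, no flare in the past 24 hours, SESC column.
counts = [2089, 226, 32, 1]                 # No Flare, C, M, X cases observed
sesc_scores = [0.200, 0.839, 0.737, 0.020]

all_row = pooled_score(counts, sesc_scores)
print(round(all_row, 3))  # 0.269, the tabulated "All" entry
```

The same check can be applied to any other block of the tables as a guard
against tabulation errors.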

In conclusion, we remark that to facilitate interpretation of Brier
scores in Tables 15-17, one may compute the square root of (B/2), where B is
the table entry. The result can be thought of as the "average deviation of
the probability estimate from 1 for the event which occurred." For example, a
Brier score of .25 corresponds to an "average probability deviation" of .35.
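
The rule can be motivated as follows, assuming the score for a single case is
the sum of squared deviations over the event and its complement (an
assumption about the definition, which is not restated in this section). Here
p denotes the probability assigned to the category which occurred.

```latex
B = (1 - p)^2 + \bigl((1 - p) - 0\bigr)^2 = 2\,(1 - p)^2
\quad\Longrightarrow\quad
\sqrt{B/2} = 1 - p ,
\qquad
\sqrt{0.25/2} = \sqrt{0.125} \approx 0.35 .
```

For a table entry averaged over many cases, the square root of (B/2) is
strictly a root-mean-square deviation, which is never smaller than the simple
average deviation.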
Table 15. Brier Scores for Pr[C, M or X Flare | x]

        Largest Event    Event       Number        Forecast Source
Model   Past 24 Hours    Observed    Cases       SESC      LR       DA

I       No Flare         No Flare     2089       .200     .034     .048
                         C Flare       226       .839    1.361    1.294
                         M Flare        32       .737    1.268    1.158
                         X Flare         1       .020    1.053     .971
                         All          2348       .269     .179     .183

        C Flare          No Flare      309       .681     .351     .347
                         C Flare       199       .319     .516     .520
                         M Flare        54       .167     .403     .410
                         X Flare         8       .178     .457     .457
                         All           570       .499     .415     .415

        M Flare          No Flare       56      1.156     .505     .501
                         C Flare        51       .112     .296     .301
                         M Flare        45       .031     .171     .173
                         X Flare         4       .026     .006     .008
                         All           156       .461     .328     .328

II      No Flare         No Flare     1761       .204     .038     .050
                         C Flare       206       .832    1.327    1.273
                         M Flare        26       .700    1.211    1.103
                         X Flare         1       .020    1.045     .997
                         All          1994       .275     .187     .191

        C Flare          No Flare      270       .691     .367     .363
                         C Flare       186       .325     .488     .493
                         M Flare        47       .174     .372     .379
                         X Flare         8       .178     .434     .430
                         All           511       .502     .412     .413

        M Flare          No Flare       47      1.137     .487     .485
                         C Flare        44       .077     .243     .246
                         M Flare        38       .035     .188     .191
                         X Flare         4       .026     .004     .005
                         All           133       .438     .306     .308

III     No Flare         No Flare      557       .245     .051     .060
                         C Flare        71       .742    1.208    1.174
                         M Flare        12       .733    1.066     .994
                         X Flare         1       .020    1.073     .989
                         All           641       .309     .200     .202

        C Flare          No Flare      132       .737     .364     .360
                         C Flare        89       .361     .499     .505
                         M Flare        24       .178     .254     .256
                         X Flare         6       .230     .412     .404
                         All           251       .538     .402     .402

        M Flare          No Flare       25       .941     .327     .286
                         C Flare        22       .062     .236     .261
                         M Flare        22       .028     .151     .184
                         X Flare         2       .013     .000     .000
                         All            71       .360     .235     .239

IV      X Flare          No Flare        3      1.433     .296     .426
                         C Flare         5       .013     .044     .152
                         M Flare         7       .109     .032     .012
                         X Flare         6       .010     .000     .001
                         All            21       .247     .063     .101
Table 16. Brier Scores for Pr[M or X Flare | x]

        Largest Event    Event       Number        Forecast Source
Model   Past 24 Hours    Observed    Cases       SESC      LR       DA

I       No Flare         No Flare     2089       .024     .001     .007
                         C Flare       226       .085     .003     .020
                         M Flare        32      1.518    1.854    1.683
                         X Flare         1       .080    1.767    1.672
                         All          2348       .050     .027     .032

        C Flare          No Flare      309       .126     .019     .021
                         C Flare       199       .270     .055     .067
                         M Flare        54       .919    1.384    1.350
                         X Flare         8       .768    1.427    1.399
                         All           570       .261     .181     .182

        M Flare          No Flare       56       .327     .104     .103
                         C Flare        51       .717     .284     .295
                         M Flare        45       .343     .704     .697
                         X Flare         4       .420     .344     .313
                         All           156       .461     .342     .343

II      No Flare         No Flare     1761       .024     .001     .007
                         C Flare       206       .088     .003     .020
                         M Flare        26      1.514    1.842    1.640
                         X Flare         1       .080    1.733    1.629
                         All          1994       .050     .026     .031

        C Flare          No Flare      270       .121     .018     .019
                         C Flare       186       .275     .053     .065
                         M Flare        47       .911    1.366    1.333
                         X Flare         8       .768    1.455    1.433
                         All           511       .260     .177     .179

        M Flare          No Flare       47       .294     .112     .112
                         C Flare        44       .701     .276     .282
                         M Flare        38       .368     .706     .702
                         X Flare         4       .420     .390     .357
                         All           133       .454     .344     .344

III     No Flare         No Flare      557       .034     .002     .010
                         C Flare        71       .081     .006     .015
                         M Flare        12      1.521    1.730    1.581
                         X Flare         1       .080    1.599    1.446
                         All           641       .067     .038

        C Flare          No Flare      132       .117     .022     .026
                         C Flare        89       .262     .074     .084
                         M Flare        24       .845    1.076    1.040
                         X Flare         6       .710    1.151    1.167
                         All           251       .252     .168     .171

        M Flare          No Flare       25       .280     .123     .126
                         C Flare        22       .635     .255     .271
                         M Flare        22       .357     .511     .510
                         X Flare         2       .340     .081     .063
                         All            71       .416     .283     .288

IV      X Flare          No Flare        3       .875     .055     .019
                         C Flare         5      1.201     .616     .623
                         M Flare         7       .112     .282     .291
                         X Flare         6       .042     .071     .051
                         All            21       .460     .269     .262
Table 17. Brier Scores for Pr[X Flare | x]

        Largest Event    Event       Number        Forecast Source
Model   Past 24 Hours    Observed    Cases       SESC      LR       DA

I       No Flare         No Flare     2089       .002
                         C Flare       226       .004
                         M Flare        32       .003
                         X Flare         1      1.280
                         All          2348       .003

        C Flare          No Flare      309       .009
                         C Flare       199       .029
                         M Flare        54       .051
                         X Flare         8      1.568
                         All           570       .042

        M Flare          No Flare       56       .020     .001     .001
                         C Flare        51       .115     .013  (illegible)
                         M Flare        45       .142     .008     .006
                         X Flare         4      1.462    1.505    1.560
                         All           156       .123     .046

II      No Flare         No Flare     1761       .002
                         C Flare       206       .005
                         M Flare        26       .003
                         X Flare         1      1.280
                         All          1994       .003

        C Flare          No Flare      270       .007
                         C Flare       186       .030
                         M Flare        47       .052
                         X Flare         8      1.568
                         All           511       .044

        M Flare          No Flare       47       .022
                         C Flare        44       .127
                         M Flare        38       .131
                         X Flare         4      1.462
                         All           133       .131

III     No Flare         No Flare      557       .002
                         C Flare        71       .005
                         M Flare        12       .004
                         X Flare         1      1.280
                         All           641       .004

        C Flare          No Flare      132       .008
                         C Flare        89       .029
                         M Flare        24       .065
                         X Flare         6      1.494
                         All           251       .056

        M Flare          No Flare       25       .024
                         C Flare        22       .104
                         M Flare        22       .129
                         X Flare         2      1.300
                         All            71       .117

IV      X Flare          No Flare        3       .128     .011     .013
                         C Flare         5       .108     .143     .147
                         M Flare         7       .297     .309     .311
                         X Flare         6       .951     .602     .603
                         All            21       .415     .311     .313



6.  SUMMARY  AND  SUGGESTIONS 


Even though the difficulties encountered in the course of the study
precluded a rigorous analysis of the data, the objective technique of logistic
regression has been demonstrated to be potentially useful for probability
forecasting of solar flares. This conclusion is based on evidence of
statistical association of solar flare incidence with many of the region
analysis variables collected by the SESC. Because many procedures for this
study were developed heuristically, all conclusions should be evaluated with
caution. To summarize steps of the analysis we list important considerations
which led to the examples in section 5:


1.  A  priori  elimination  of  no  sunspot  cases  and  data  collected 
after  January  31,  1979. 

2.  A  priori  elimination  of  location  variables  and  a  few 
troublesome  region  analysis  variables. 

3.  Selection  of  discriminant  analysis  and  logistic  regression  as 
the  only  methods  to  be  considered  for  objective  probability 
forecasting. 

4.  Partitioning  the  data  into  four  segments  (conditional  on  the 
class  of  flare  occurring  during  the  past  24  hours)  to  avoid 
introducing  covariate  terms  into  the  analysis. 

5.  Heuristic  selection  of  nine  basic  explanatory  variables, 
divided  into  three  sets  depending  on  the  frequency  of 
missing  values. 

6.  Rescaling  of  some  of  the  nine  basic  variables  based  on 
(intuitive)  examination  of  sample  standard  deviations. 
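
Step 4 above can be sketched as follows. The record layout and the field name
"past_flare" are hypothetical; the codes follow the FLARER coding of Appendix
A (0 none, 1 C, 2 M, 3 X, 9 no data).

```python
from collections import defaultdict

NO_DATA = 9  # "No data" code used for flare class in Appendix A


def partition_by_past_flare(records):
    """Group records by the largest flare class of the past 24 hours."""
    segments = defaultdict(list)
    for rec in records:
        past = rec["past_flare"]  # hypothetical field name
        if past == NO_DATA:
            continue  # cases with no data cannot be assigned to a segment
        segments[past].append(rec)
    return dict(segments)


# Hypothetical records (assumption, for illustration only).
records = [
    {"region": 1, "past_flare": 0},
    {"region": 2, "past_flare": 1},
    {"region": 3, "past_flare": 9},
    {"region": 4, "past_flare": 0},
]
segments = partition_by_past_flare(records)
print({k: len(v) for k, v in segments.items()})
```

A separate prediction equation would then be estimated within each segment,
which is how the conditional models avoid an explicit persistence covariate.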


In the examples we considered the prediction of Pr[C, M, or X Flare | x],
Pr[M or X Flare | x], and Pr[X Flare | x], where x denotes a subset of the
nine transformed variables. These three probabilities are those estimated by
the SESC.

Developing  the  full  potential  of  objective  forecast  techniques  will 
demand  great  effort  and  cooperation  on  a  broad  front.  We  have  identified  the 
central  considerations  for  continuation  of  this  work  to  include: 


1.  Devotion  of  sufficient  resources  for  data  collection, 
correction  and  management.  No  progress  can  be  expected  if 
reliable  information  is  not  available  (see  section  2.1). 

2.  Re-evaluation of objectives. Are flare intensity predictions
desirable? Would forecasts of other related probabilities be
useful, e.g., Pr[M Flare | x]?

3.  Consideration  of  transformed,  lagged,  rate-of-change,  and 
interaction  variables. 
4.  Stepwise selection of variables.

5.  Evaluation  of  the  need  for  conditional  models  to  account  for 
persistence. 

6.  Examination  of  adaptive  models  to  follow  secular  variation  in 
the  solar  cycle. 

7.  Study of spatial and time correlation (possibly accounted for
by conditional models or inclusion of lagged or location
variables).
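
As an illustration of the lagged and rate-of-change variables suggested in
the list above, the following sketch derives both from a daily series for a
single region. The series is hypothetical, and a one-day lag is only one of
many possible choices.

```python
def add_lag_and_change(series):
    """Pair each daily value with the previous day's value and the daily change.

    series: values of one variable for one region, in time order.
    """
    rows = []
    for t in range(1, len(series)):
        rows.append({
            "value": series[t],
            "lag1": series[t - 1],                # value one day earlier
            "change": series[t] - series[t - 1],  # one-day rate of change
        })
    return rows


# Hypothetical daily values of a coded variable for one region.
daily_values = [2, 3, 5]
rows = add_lag_and_change(daily_values)
print(rows)
```

The first day of each region's transit has no predecessor and is dropped,
which would reduce the number of usable cases in any model using these
variables.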

The inherent stochastic nature of solar flare phenomena should dictate
that statistical methods, particularly in the fields of multivariate analysis
and stochastic processes, be developed in the direction of specific
peculiarities arising in the flare prediction problem. It should be clearly
pointed out that submitting solar flare data to various cookbook methods will
not necessarily yield the most efficient analysis. Future determination of a
multitude of pertinent features of the data could very likely depend on the
development of appropriate theories and procedures based on intuitive leads by
solar scientists regarding plausible stochastic models for solar flares.

Perhaps the single most important consideration for further
investigations concerns the types of variables to be recorded and their scale
and time of measurement. Understanding based on stochastic models relating
region analysis data to flare occurrence may not be achieved if the
informative variables are not first determined and then properly collected.
Such basic problems, if not satisfactorily resolved, may result in future
solar flare records which cannot possibly provide the statistical information
of interest. Thus, the need for statistical planning in this field cannot be
exaggerated.
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APPENDIX  A 


Region  Analysis  Variables 


This  analysis  is  based  on  data  reported  or  observed  during  the  past  24  hours 
only  (0000  UT  to  2400  UT)  and  should  be  completed  in  time  for  2200Z  forecast. 


Region  Analysis 


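LOCATION

 1.  DATE       Year, month, day.
 2.  REGION     Region number.
 3.  APPLONG    First appearance longitude.
 4.  CURLONG    Current longitude.
 5.  NSLAT      North or south latitude - current.
 6.  CURLAT     Current latitude.
 7.  CARLONG    Carrington longitude.
 8.  AGE        Age of region in days this transit.

WHITE LIGHT

 9.  SPOTCLAS   Spot Class
         None observed .................................... 0
         Spot class three letter code
         No data .......................................... 9

10.  MAGCLAS    Magnetic Class
         No spots ......................................... 0
         Alpha ............................................ 1
         Beta ............................................. 2
         Beta-Gamma ....................................... 3
         Gamma ............................................ 4
         Beta-Delta ....................................... 5
         Beta-Gamma-Delta ................................. 6
         Gamma-Delta ...................................... 7
         No data .......................................... 9

11.  RV         Magnetic Field Strengths Polarity
         No spots ......................................... 0
         Red (+ polarity) ................................. 1
         Violet (- polarity) .............................. 2
         No data .......................................... 9

12.  MAGSTR     Magnetic Field Strengths (Largest)
         No spots ......................................... 0
         Two digit value
         (if same use polarity of largest total)
         No data .......................................... 99

13.  MAGGRAD    Magnetic Gradients in Gamma/Km
         No spots or unipolar region ...................... 0.00
         Enter three digit gradient as N.NN
         No data .......................................... 9.00

14.  SSDYNAM    Sunspot Dynamics
         No spots or not applicable ....................... 0
         Coalescing of spots .............................. 1
         Spot rotation .................................... 2
         Relative spot motion (opposite polarity spots) ... 3
         No data .......................................... 9

15.  SSINTER    Interaction With Another Region
         None occurred .................................... 0
         Strong spots of opposite polarity converge
         (from less than 2 degrees apart) ................. 1
         No data .......................................... 9

16.  STGDEV     Stage of Development
         No spots ......................................... 0
         Mature group (stable) ............................ 1
         Decaying ......................................... 2
         Growing .......................................... 3
         Rapid decay (spot or area decrease by > 50%) ..... 4
         Rapid growth (spot or area increase by > 50%) .... 5
         Rapid growth (spot or area increase by > 100%) ... 6
         No data .......................................... 9

H-ALPHA

17.  LEADTRAI   Leader Emerged in Leader or Trailer Polarity Fields
                (from Previous Synoptic map)
         Structure not definite ........................... 0
         Returning region ................................. 1
         < 5 deg of NL and out of phase with NL ........... 2
         > 5 deg of NL and in leader polarity fields ...... 3
         > 5 deg of NL and in trailer polarity fields ..... 4
         < 5 deg of NL and in-phase with NL ............... 5
         No data .......................................... 9

18.  RETREG     Region Number if Returning Region
         Region not returning ............................. 0
         Region # if returning
         No data .......................................... 9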
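19.  SECTEOW    Relationship with Nearest Sector Boundary
                (Hale = Region Polarity Matches the Boundary)
         Sector structure not definite .................... 0
         Region is > 30 degrees from nearest boundary ..... 1
         Non-Hale and 10 to 30 deg west of boundary ....... 2
         Non-Hale and 10 to 30 deg east of boundary ....... 3
         Non-Hale and < 10 deg of boundary ................ 4
         Hale and 10 to 30 deg west of boundary ........... 5
         Hale and 10 to 30 deg east of boundary ........... 6
         Hale and < 10 deg of boundary .................... 7
         No data .......................................... 9

20.  PLAGFIL    Plage Compactness and Embedded Filament
                (Compact = NL Corridor > 2 Degrees Wide)
         Non-compact plage and no filament ................ 0
         Non-compact plage with filament .................. 1
         Non-compact plage with active filament ........... 2
         Compact plage without embedded filament .......... 3
         Compact plage with embedded filament ............. 4
         Compact plage with active embedded filament ...... 5
         No data .......................................... 9

21.  NEUTLOR    Main NL Orientation within Plage
         Weak structure ................................... 0
         North-south (+/- 45 degrees to NS) ............... 1
         East-west ........................................ 2
         Hairpin (E-W) .................................... 3
         Mostly circular .................................. 4
         No data .......................................... 9

22.  REVPOL     Orientation Within Plage
         No reverse polarity .............................. 0
         Reverse polarity ................................. 1
         No data .......................................... 9

23.  NEUTLCOM   Neutral Line Complexity
         No kinks or weak structure ....................... 0
         1-3 kinks (very simple region) ................... 1
         4-6 kinks (simple region) ........................ 2
         7-12 kinks (intermediate region) ................. 3
         > 12 kinks (very complex) ........................ 4
         No data .......................................... 9

24.  NEUTLCHG   Neutral Line Temporal Changes
         No definite trend ................................ 0
         Neutral line becoming simple ..................... 1
         Neutral line becoming complex .................... 2
         No data .......................................... 9

25.  ASSOCFIL   Associated Filament (External to Region but Along Common
         No associated filament ........................... 0
         Filament unchanged ............................... 1
         Filament growing ................................. 2
         Filament disappeared within past 24 hours ........ 3
         Filament darkens or is active .................... 4
         No data .......................................... 9

26.  BRTPTS     Bright Points
         None occurred .................................... 0
         Occurred but not along neutral line .............. 1
         Occurred along the neutral line .................. 2
         No data .......................................... 9

27.  PLAGFLUX   Plage Fluctuations
         None occurred .................................... 0
         Plage fluctuations ............................... 1
         No data .......................................... 9

28.  ISOPOLE    Isolated Pole
         None occurred or region is new ................... 0
         Isolated pole in region .......................... 1
         No data .......................................... 9

29.  EFR        Emerging Flux
         None occurred or region is new ................... 0
         New EFR emerges within existing spot group ....... 1
         New EFR emerges near region
         (within 5 degrees of existing spot group) ........ 2
         No data .......................................... 9

30.  AFS        AFS Present
         None present ..................................... 0
         AFS present ...................................... 1
         No data .......................................... 9

RADIO

31.  RADIOACT   Radio Burst and/or Sweep (Multiple Entries Possible)
         None occurred or small events .................... 0
         > 250 flux units at 10 cm ........................ 1
         > 1000 flux units at 10 cm ....................... 2
         Type III sweep ................................... 3
         Type IV sweep .................................... 4
         Type II followed by type IV sweep ................ 5
         U burst .......................................... 6
         Major and complex 10 cm burst .................... 7
         > 1000 flux units at 10 cm and a U burst ......... 8
         Type III and type IV sweep ....................... 10
         > 250 flux units at 10 cm and type III
         and type IV sweep ................................ 11
         No data .......................................... 9

HISTORY THIS TRANSIT

32.  FLAREHIS   Largest Flare Since Region Appeared Including Today
         None occurred or first day observed .............. 0
         C class flares have occurred ..................... 1
         M class flares have occurred ..................... 2
         X class flares have occurred ..................... 3
         No data or region appeared on east limb .......... 9

33.  FIRSTAPP   Region First Appeared
         Formed on disk ................................... 0
         Came around east limb - first transit ............ 1
         Second transit ................................... 2
         Third transit .................................... 3
         Fourth transit ................................... 4
         Fifth transit (and etc) .......................... 5
         No data .......................................... 99

34.  PROTHIS    Proton Event
         No particle event ................................ 0
         Proton 10 event (>10 p/cm*cm*sec*ster at >10 MeV)  1
         Ground level event ............................... 2
         No data .......................................... 9

REGION FORECASTS

35.  CRFOR      Flare Forecast      C probability
36.  MRFOR      Flare Forecast      M probability
37.  XRFOR      Flare Forecast      X probability
38.  PRFOR      Flare Forecast      Proton event probability

EVENTS THAT OCCURRED DURING NEXT 24 HOURS

39.  FLARER     Largest Flare for this Region
         None occurred or < C0 ............................ 0
         Class C .......................................... 1
         Class M .......................................... 2
         Class X .......................................... 3
         No data - or region rotated off .................. 9

40.  PROTONR    Proton Event for this Region
         None occurred .................................... 0
         Proton event ..................................... 1
         No data .......................................... 9

TOTAL SUN VARIABLES

41.  FLUX       10 Cm Flux          10 cm flux for today

FORECAST FOR SUN

42.  CSFOR      Flare Forecast      C probability
43.  MSFOR      Flare Forecast      M probability
44.  XSFOR      Flare Forecast      X probability
45.  PSFOR      Proton Forecast     Proton event probability

EVENTS THAT OCCURRED DURING THIS 24 HOURS

46.  FLARERT    Largest Flare for this Region
         None occurred or < C0 ............................ 0
         Class C .......................................... 1
         Class M .......................................... 2
         Class X .......................................... 3
         No data or region rotated off .................... 9

47.  RECSPOT    Recoded Spot Class
         No spots observed ................................ 0
         Less than 10% .................................... 1
         Between 10% and 20% .............................. 2
         Between 20% and 30% .............................. 3
         Between 30% and 50% .............................. 4
         Between 50% and 60% .............................. 5
         Between 60% and 100% ............................. 6
         Between 100% and 200% ............................ 7
         Between 200% and 300% ............................ 8
         Spotclas didn't occur in last eight years ........ 98
         No data .......................................... 99

48.  PROTONT    Proton Event for this Region
         None occurred .................................... 0
         Proton event ..................................... 1
         No data .......................................... 9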
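
Because the "No data" code differs among variables (9, 99 or 9.00) and 9 is
not always the largest valid code (RADIOACT also uses 10 and 11), missing
values are best recoded variable by variable before any model is fitted. The
following is a minimal sketch covering a few of the variables above;
treating RECSPOT code 98 as missing is an assumption made here, not a rule
stated in this appendix.

```python
# "No data" codes for a few variables, as listed in this appendix.
NO_DATA_CODES = {
    "MAGCLAS": {9},
    "MAGSTR": {99},
    "MAGGRAD": {9.00},
    "RADIOACT": {9},       # 10 and 11 are valid codes for this variable
    "RECSPOT": {98, 99},   # treating 98 as missing is an assumption
    "FLARER": {9},
}


def to_missing(variable, value):
    """Return None when the value is the variable's "No data" code."""
    return None if value in NO_DATA_CODES[variable] else value


print(to_missing("MAGCLAS", 9))    # None: 9 means "No data" for MAGCLAS
print(to_missing("MAGSTR", 9))     # 9: a valid two digit field strength
print(to_missing("RADIOACT", 10))  # 10: a valid radio burst code
```

Keeping the codes in one table makes it harder for a "No data" value to be
mistaken for a large measurement.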
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APPENDIX  B 


Probability  Forecast  Tables 

The purpose of this appendix is to present classification tables for
SESC, LR and DA probability forecasts, based on the analysis presented in
section 5 of this report. Where applicable, FLARER is crosstabulated with
appropriate probability estimates using the SPSS package of computer programs.
Variables crosstabulated with FLARER are identified by a one letter source
code followed by one to three letters indicating the event which is forecast.
For example, SCMX denotes the SESC probability of a C, M or X flare occurring;
i.e., SCMX is equivalent to CRFOR in the data base. Thus LMX is the LR
forecast of an M or X event, analogous to MRFOR in the data base.
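
The naming convention can be expressed as a short sketch. The source letters
"S" and "L" come from the examples above; "D" for DA forecasts is an
assumption made by analogy.

```python
SOURCES = {"S": "SESC", "L": "LR", "D": "DA"}  # "D" is assumed by analogy
EVENTS = ("C", "M", "X")


def parse_forecast_name(name):
    """Split a name such as "SCMX" into its source and forecast events."""
    source_code, event_codes = name[0], name[1:]
    if source_code not in SOURCES:
        raise ValueError("unknown source code: " + source_code)
    if not 1 <= len(event_codes) <= 3 or any(e not in EVENTS for e in event_codes):
        raise ValueError("unknown event codes: " + event_codes)
    return SOURCES[source_code], tuple(event_codes)


print(parse_forecast_name("SCMX"))  # ('SESC', ('C', 'M', 'X'))
print(parse_forecast_name("LMX"))   # ('LR', ('M', 'X'))
```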

Tables reported in B.1 - B.3 are for Model II. Results for Model IV are
given in B.4.


Contents 


B.0  SESC Overall Classification

B.1  No Flare in Past 24 Hours

     B.1.1  Probability of C, M, or X
     B.1.2  Probability of M or X
     B.1.3  Probability of X

B.2  C Flare in Past 24 Hours

     B.2.1  Probability of C, M, or X
     B.2.2  Probability of M or X
     B.2.3  Probability of X

B.3  M Flare in Past 24 Hours

     B.3.1  Probability of C, M, or X
     B.3.2  Probability of M or X
     B.3.3  Probability of X

B.4  X Flare in Past 24 Hours

     B.4.1  Probability of C, M, or X
     B.4.2  Probability of M or X
     B.4.3  Probability of X


B.0  SESC Overall Classification
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APPENDIX  C 


Crosstabulations

In this appendix crosstabulations of most solar flare variables with
FLARER are presented. For these tables the data are not partitioned on the
past flare event, but conditional tables can be easily obtained using the SPSS
programs. The reader is referred to the SPSS user's manual for a discussion
of the statistics listed below tables. Variables have been grouped according
to the categories noted in Appendix A.
«0  FLARE  WIT  PAY  j _ 1»,T 
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I  I  88.8  T  35.2 

II  5.2  I  3.8 

P  I  -  *••_!  1.5 
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I 
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I 
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i 
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1 
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I 
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I 
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I 
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2 
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I 

I 
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I 
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-I 
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2  I 

3 

I 
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1 

3 

i 

1H 

21 

X  CLASS 

NEXT 

OAY  1 

76.2 

4.6 

6.6 

14.3 

: 

0 

.9 

i 

•  w 

1.9 

4.3 

6.9 

i 

0 

I 

•  * 

.0 

.0 

•  1 

i 

0 

r 

-f  — 

— — •! 

COLUMN 

4  ?9B 

(6 

lb 

34 

73* 

4414 

TOTAL 

97 .6 

1.9 

•  4 

•  6 

6 

100.0 

RAM  CHI  SQUARE 

«  IAS. 62937 

WITH 

•  DEGREES 

OF  freedom. 

SIGNIFICANCE  • 

0 

xendallxs 

TAU 

c  • 

.62690. 

SIGNIFICANCE  • 

•  0090 

CAHHA  * 

.roe  19 

SOMERSXS  C 

USYhmETPIC)  ■ 

42444  KITH  FLARE* 

DEPENDENT 

■  .07293 

NITm  SSOTNAM  DEPENDENT. 

SOMERSFS  D  IStmmETBIC)  •  •  1 2 3<«6 

UMBER  OF  MISSING  OSSERVATIONS  *  73 


SSINT

          COUNT   I
          ROW PCT I NO INTER   SPOTS      NO DATA        ROW
          COL PCT I ACTION     CONVERGE                TOTAL
          TOT PCT I       0  I       1  I       9  I
FLARER    --------I----------I----------I----------I
               0  I    3653  I      27  I     38M  I    3680
NO FLARE NEXT DAY I    99.3  I      .7  I       0  I    82.9
                  I    83.2  I    58.7  I       0  I
                  I    82.3  I      .6  I       0  I
                 -I----------I----------I----------I
               1  I     569  I      13  I      4M  I     582
 C CLASS NEXT DAY I    97.8  I     2.2  I       0  I    13.1
                  I    13.0  I    28.3  I       0  I
                  I    12.8  I      .3  I       0  I
                 -I----------I----------I----------I
               2  I     151  I       4  I      6M  I     155
 M CLASS NEXT DAY I    97.4  I     2.6  I       0  I     3.5
                  I     3.4  I     8.7  I       0  I
                  I     3.4  I      .1  I       0  I
                 -I----------I----------I----------I
               3  I      19  I       2  I      1M  I      21
 X CLASS NEXT DAY I    90.5  I     9.5  I       0  I      .5
                  I      .4  I     4.3  I       0  I
                  I      .4  I      .0  I       0  I
                 -I----------I----------I----------I
          COLUMN       4392         46        49M       4438
           TOTAL       99.0        1.0          0      100.0
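
The four entries printed in each cell (count, row percent, column percent, total percent) follow directly from the counts. The Python sketch below illustrates that arithmetic. It assumes the SSINT counts as read from the scanned table, with the "NO DATA" cases excluded; the names `cell_percentages` and `SSINT_COUNTS` are illustrative and are not part of the original SPSS run.

```python
def cell_percentages(counts):
    """Return (count, row %, column %, total %) for each cell of a two-way table."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    return [
        [
            (
                c,
                round(100 * c / row_totals[i], 1),  # ROW PCT
                round(100 * c / col_totals[j], 1),  # COL PCT
                round(100 * c / n, 1),              # TOT PCT
            )
            for j, c in enumerate(row)
        ]
        for i, row in enumerate(counts)
    ]


# Assumed reading of the SSINT counts (missing cases excluded).
# Rows: no flare, C class, M class, X class next day.
# Columns: no interaction, spots converge.
SSINT_COUNTS = [[3653, 27], [569, 13], [151, 4], [19, 2]]

cells = cell_percentages(SSINT_COUNTS)
for row in cells:
    print(row)
```

Under these assumed counts the percentages agree with the legible entries of the printout, which is a useful consistency check when individual digits in the scan are doubtful.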
[Cell entries illegible.]

COLUMN TOTAL      4249 (100.0)

RAW CHI SQUARE = [illegible] WITH [illegible] DEGREES OF FREEDOM. SIGNIFICANCE = .0000

SOMERS'S D (ASYMMETRIC) = .06976 WITH FLARER DEPENDENT. = [illegible] WITH STGOEV DEPENDENT.

SOMERS'S D (SYMMETRIC) = [illegible]
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C.3  H-Alpha  Variables 


LEADTRAI

[Crosstabulation of FLARER by LEADTRAI; cell entries illegible.]

COLUMN CATEGORIES: STRUCT NOT DIF (0), RETURN REGION (1), < 5 NL OUT PHAS, > 5 NL LEADER, > 5 NL TRAILER, < 5 NL IN PHASE (5)

ROW TOTAL: NO FLARE NEXT DAY 3718 (82.9), C CLASS NEXT DAY 586 (13.1), M CLASS NEXT DAY 161 (3.6), X CLASS NEXT DAY 22 (.5)

COLUMN TOTAL      4487 (100.0)


RAW CHI SQUARE = 51.02204 WITH 14 DEGREES OF FREEDOM. SIGNIFICANCE = .1100

KENDALL'S TAU C = -.02600. SIGNIFICANCE = .1011

GAMMA = [illegible]

SOMERS'S D (ASYMMETRIC) = -.02471 WITH FLARER DEPENDENT. = -.16630 WITH LEADTRAI DEPENDENT.

SOMERS'S D (SYMMETRIC) = -.13600


RETREG

          COUNT   I
          ROW PCT I NOT RET    RETURN         ROW
          COL PCT I                         TOTAL
          TOT PCT I       0  I       1  I
FLARER    --------I----------I----------I
               0  I    2703  I    1015  I    3718
NO FLARE NEXT DAY I    72.7  I    27.3  I    82.9
                  I    84.7  I    78.3  I
                  I    60.2  I    22.6  I
                 -I----------I----------I
               1  I     386  I     200  I     586
 C CLASS NEXT DAY I    65.9  I    34.1  I    13.1
                  I    12.1  I    15.4  I
                  I     8.6  I     4.5  I
                 -I----------I----------I
               2  I      90  I      71  I     161
 M CLASS NEXT DAY I    55.9  I    44.1  I     3.6
                  I     2.8  I     5.5  I
                  I     2.0  I     1.6  I
                 -I----------I----------I
               3  I      11  I      11  I      22
 X CLASS NEXT DAY I    50.0  I    50.0  I      .5
                  I      .3  I      .8  I
                  I      .2  I      .2  I
                 -I----------I----------I
          COLUMN       3190       1297       4487
           TOTAL       71.1       28.9      100.0


RAW CHI SQUARE = [illegible] WITH 3 DEGREES OF FREEDOM. SIGNIFICANCE = .0000

KENDALL'S TAU C = .05555. SIGNIFICANCE = .0000

GAMMA = .21341

SOMERS'S D (ASYMMETRIC) = .06758 WITH FLARER DEPENDENT. = .09414 WITH RETREG DEPENDENT.

SOMERS'S D (SYMMETRIC) = .07868
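
The summary statistics printed under each table can be recomputed from the cell counts alone. The following Python sketch does this for the RETREG table. It assumes the counts as read from the scan and uses the standard textbook definitions of the Pearson chi-square, Goodman-Kruskal gamma, Kendall's tau c, and Somers' D; the names `ordinal_measures` and `RETREG_COUNTS` are hypothetical, and the code is not the original SPSS procedure.

```python
def ordinal_measures(table):
    """Chi-square and ordinal association measures for an r x c table of counts."""
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)

    # Pearson chi-square against the independence expectation.
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Concordant (p) and discordant (q) pairs of observations.
    p = q = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        p += table[i][j] * table[k][l]
                    elif l < j:
                        q += table[i][j] * table[k][l]

    m = min(n_rows, n_cols)
    # Pairs of observations that differ on the column (resp. row) variable.
    untied_cols = (n * n - sum(c * c for c in col_tot)) // 2
    untied_rows = (n * n - sum(r * r for r in row_tot)) // 2
    return {
        "chi2": chi2,
        "df": (n_rows - 1) * (n_cols - 1),
        "gamma": (p - q) / (p + q),
        "tau_c": 2 * m * (p - q) / (n * n * (m - 1)),
        "d_row_dependent": (p - q) / untied_cols,
        "d_col_dependent": (p - q) / untied_rows,
        "d_symmetric": 2 * (p - q) / (untied_cols + untied_rows),
    }


# Assumed reading of the RETREG counts.
# Rows: no flare, C class, M class, X class next day.
# Columns: region not returning, region returning.
RETREG_COUNTS = [[2703, 1015], [386, 200], [90, 71], [11, 11]]

measures = ordinal_measures(RETREG_COUNTS)
for name, value in measures.items():
    print(f"{name:16s} {value:.5f}")
```

With these assumed counts the sketch gives gamma of about .21341 and Somers' D of about .09414 with RETREG dependent, and a chi-square of roughly 35.3 on 3 degrees of freedom. Agreement of the recomputed measures with the legible digits of a printout is one way to check a reading of a damaged table.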
SECTION

[Crosstabulation of FLARER by SECTION; column labels and cell entries illegible.]

COLUMN TOTAL      4486 (100.0)


RAW CHI SQUARE = 56.15563 WITH 21 DEGREES OF FREEDOM. SIGNIFICANCE = .0000

KENDALL'S TAU C = [illegible]. SIGNIFICANCE = .0061

GAMMA = [illegible]

SOMERS'S D (ASYMMETRIC) = .07017 WITH FLARER DEPENDENT. = .05710 WITH SECTION DEPENDENT.

SOMERS'S D (SYMMETRIC) = [illegible]

NUMBER OF MISSING OBSERVATIONS = 1


placfu 

_ COUNT _ I _ _ _ _  _  .  _ _  _ 

ROW PCT  I  NON-COM   NON-COM   NON-COM   COMPACT   COMPACT   COMPACT
COL PCT  I  NO FILMT  NL FILMT  ACT FIL   NO FILMT  NL FILMT  ACT FIL
TOT PCT  I  0  I  1  I  2  I  3  I  4  I  5  I  6

FlAREB  - 1 - 1™ . I - 7 - 1 - 1 - 1 - 

0  r  165.  I  IT*  I  113  I  660  7  111  I  67  I  0 

NO  FLARE  MKT  PAT  I  5J.3  1  70.6  I  <u  0_J_  .1 9,.}  _I _ 4,l_l_JUA-_l _ 0 

I  90.3  I  07.5  I  69.5  I  71.5  I  60.7  I  67.0  I  0 

I  61.  S  I  16.9  I  1.1  I  16.1  I  1.1  I  1.7  1  0 

I  I  151  I  75  I  67  I  196  I  67  1  15  1  0 

C  CtOSS  NEAT  OAT  I  77.6  I  11.6  T  7.7  I  15.7  I  F.6  I  1.6  I  I 

_ i _ «.7_I _ 9 '  7 _ I _ i  7  .  J  I  11,7  1  76.1 _ X  _.li,7_l _ fi 

I  l.a  I  1.9  I  1.1  I  6.9  I  1.7  I  .91  I 

-I - - 1 - - 1 - 1 - J - - J - j - 

7  I  76  I  19  T  11  I  56  I  11  I  16  I  0 

N  CLASS  NEAT  OAT  I  17.5  I  16.1  I  0.1  I  60.0  1  9.0  I  10.6  1  I 

I  1.1  I  1.5  1  5.9  I  6.0  I  6.7  I  16.1  I  f 

_ I _ »»_L  j«_J-  .L  I _ 1.4_J _ I  _ <_ 

-I . -I - 1- . — T . I - - ----I - 

II  71  tl  61  71  71  71  1 

B  CLASS  NEAT  OAT  I  10.0  I  10.0  I  70. S  T  15.0  I  10.0  1  10.0  1  5.0 

I  .11  .11  f.l  I  .6  I  1.0  I  t.l  I  HI. I 

I  .11  .11  .11  .7  1  .11  .11  ,0 

COLUMN  "  "lAli  *  *"  7TI~  ”*  107  ~  "  093 _  19 5~  ll"^  r 

TOTAL  66.0  19.6  4.7  77.9  6.9  7.6  .1 


X 


NO  DATA  ROM 
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9  I 

— . I 


660M  I  17T 6 


0  I 
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67N  1  566 
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_1_  X _ 
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I  I  S.6 
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_•  X _ 
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•  I  .5 
SOMERS'S D (SYMMETRIC) = .70767

[Crosstabulation of FLARER; variable name, column labels, and cell entries illegible.]

COLUMN TOTAL      4400 (100.0)

KENDALL'S TAU C = .00026. SIGNIFICANCE = .6000
KENDALL'S TAU C = [illegible]. SIGNIFICANCE = [illegible]

SOMERS'S D (ASYMMETRIC) = .14194 WITH FLARER DEPENDENT. = .19919 WITH AIRSTAPP DEPENDENT.

SOMERS'S D (SYMMETRIC) = [illegible]


■  V  ■  (KftfeSiiSNW  * 


74 


PROTMlS 
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COL  PCT  I CAL  EVNT 
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